Category : AI & Automation

Generative AI investments: how to estimate funding for GenAI projects

In a January 2024 survey by Everest Group, 68% of CIOs cited budget concerns as a major hurdle to starting or scaling their generative AI investments. Just like estimating costs for legacy software, getting the budget right is crucial for generative AI projects. Misjudged estimates can lead to significant time loss and complications with resource management.

Before diving in, it’s essential to ask: Is it worth making generative AI investments now, despite the risks and the ever-changing landscape, or should we wait? 

Simple answer: Decide based on risk and the ease of implementation. It’s evident that generative AI is going to disrupt numerous industries. This technology isn’t just about doing things faster; it’s about opening new doors in product development, customer engagement, and internal operations. When we speak with tech leaders, they tell us about the number of use cases pitched by their teams. However, identifying the most promising generative AI idea to pursue can be a maze in itself. 

This blog presents a practical approach to estimating the cost of generative AI projects. We’ll walk you through picking the right use cases, LLM providers, pricing models and calculations. The goal is to guide you through the GenAI journey from dream to reality. 

Choosing Large Language Models (LLMs) 

When selecting an LLM, the main concern is budget. LLMs can be quite expensive, so choosing one that fits your budget is essential. One factor to consider is the number of parameters in the LLM. Why does this matter? The parameter count gives a rough indication of both the cost and the speed of a model: generally, more parameters mean higher costs and slower processing times. A model’s speed and performance also depend on many other factors, but for this article’s purpose, treat the parameter count as a basic estimate of what a model can do.
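One rough way to see why parameter count drives cost is to look at the memory needed just to hold the model weights. The sketch below assumes 2 bytes per parameter (16-bit precision) and ignores activations, caching, and serving overhead. The function name and the example model sizes are illustrative assumptions, not figures from any provider.

```python
def weight_memory_gb(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Approximate memory for the model weights only, in gigabytes (10^9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Illustrative sizes only: a 7-billion and a 70-billion parameter model.
for label, params in [("7B", 7e9), ("70B", 70e9)]:
    print(f"{label}: about {weight_memory_gb(params):.0f} GB of weights at 16-bit precision")
```

A model with ten times the parameters needs roughly ten times the memory for weights, which is one reason larger models call for more (or larger) GPUs.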

Types of LLMs 

There are three main types of LLMs: encoder-only, encoder-decoder, and decoder-only. 

  1. Encoder-only model: This model only uses an encoder, which takes in and classifies input text. Such models are typically trained to predict missing or “masked” words within the text and to perform next-sentence prediction. 
  2. Encoder-decoder model: These models first encode the input text (like encoder-only models) and then generate or decode a response based on the now encoded inputs. They can be used for text generation and comprehension tasks, making them useful for translation. 
  3. Decoder-only model: These models are used solely to generate the next word or token based on a given prompt. They are simpler to train and are best suited for text-generation tasks. Models like GPT, Mistral, and LLaMa fall into this category. Typically, if your project involves generating text, decoder-only models are your best bet. 

Our implementation approach 

At Robosoft, we’ve developed an approach to solving client problems. We carefully choose models tailored to the use case, considering users, their needs, and how to shape interactions. Then, we create a benchmark, including cost estimates. We compare four or five models, analyze the results, and select the top one or two that stand out. Afterward, we fine-tune the chosen model to match clients’ preferences. It’s a complex process, not simple math, but we use data to understand and solve the problem. 

Where to start? 

Start with smaller, low-risk projects that help your team learn or boost productivity. Generative AI relies heavily on good data quality and diversity. So, strengthen your data infrastructure by kicking off smaller projects now, ensuring readiness for bigger AI tasks later.


In a recent Gartner survey of over 2,500 executives, 38% reported that their primary goal for investing in generative AI is to enhance customer experience and retention. Following this, 26% aimed for revenue growth, 17% focused on cost optimization, and 7% prioritized business continuity. 

Begin with these kinds of smaller projects. They will help you get your feet wet with generative AI while keeping risks low and setting you up for bigger things in the future. 

Different methods of implementing GenAI 

There are several methods for implementing GenAI, including RAG, Zero Shot, One Shot, and Fine Tuning. These are effective strategies that can be applied independently or combined to enhance LLM performance based on task specifics, data availability, and resources. Consider them as essential tools in your toolkit. Depending on the specific problem you’re tackling, you can select the most fitting method for the task at hand. 

  • Zero shot and One shot: These are prompt engineering approaches. In the zero-shot approach, the model makes predictions without any examples in the prompt or training on the specific task; it suits simple, general tasks that rely on pre-trained knowledge. In the one-shot approach, the prompt includes a single example for the model to follow before it makes predictions, which is ideal for tasks where one example can significantly improve performance. 
  • Fine tuning: This approach further trains the model on a specific dataset to adapt it to a particular task. It is necessary for complex tasks requiring domain-specific knowledge or high accuracy. Fine tuning incurs higher costs due to the need for additional computational power and training tokens. 
  • RAG (Retrieval-Augmented Generation): RAG links LLMs with external knowledge sources, combining the retrieval of relevant documents or data with the model’s generation capabilities. This approach is ideal for tasks requiring up-to-date information or integration with large datasets. RAG implementation typically incurs higher costs due to the combined expenses of LLM usage, embedding models, vector databases, and compute power. 
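The difference between the zero-shot and one-shot approaches above is easiest to see in the prompt text itself. The following is a minimal sketch; the function names, the `Input:`/`Output:` layout, and the sample sentences are assumptions made for illustration, not a format required by any LLM provider.

```python
def zero_shot_prompt(task: str, user_input: str) -> str:
    """Instruction plus the user's input, with no examples."""
    return f"{task}\n\nInput: {user_input}\nOutput:"


def one_shot_prompt(task: str, example_input: str, example_output: str,
                    user_input: str) -> str:
    """Same instruction, preceded by exactly one worked example."""
    return (
        f"{task}\n\n"
        f"Input: {example_input}\nOutput: {example_output}\n\n"
        f"Input: {user_input}\nOutput:"
    )


TASK = "Rewrite the job description as one concise resume bullet."
print(zero_shot_prompt(TASK, "Managed a team of five engineers."))
print("---")
print(one_shot_prompt(
    TASK,
    "Handled customer calls daily.",
    "Resolved customer queries by phone every day.",
    "Managed a team of five engineers.",
))
```

Note that the one-shot prompt is longer, so it consumes more input tokens per request. That trade-off matters once you start estimating cost per interaction.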

Key factors affecting generative AI investments (Annexure-1)

  • Human Resources: Costs associated with salaries for AI researchers, data scientists, engineers, and project managers. 
  • Technology and Infrastructure: Expenses for hardware (GPUs, servers), software licensing, and cloud services. 
  • Data: Costs for acquiring data, as well as storing and processing large datasets. 
  • Development and Testing: Prototyping and testing expenses, including model development and validation. 
  • Deployment: Integration costs for implementing AI solutions with existing systems and ongoing maintenance. 
  • Indirect costs: Legal and compliance, plus marketing and sales. 

Elements of LLMs

LLM pricing  

Once you choose the implementation method, you must select an LLM service (refer to table 1 below) and then work on prompt engineering, which is part of software engineering. 

Commercial GenAI products work on a pay-as-you-go basis, but it’s tricky to predict their usage. When building new products and platforms, especially in the early stages of new technologies, it’s risky to rely on just one provider. 

For example, if your app serves thousands of users every day, your cloud computing bill can skyrocket. Instead, we can achieve similar or better results using a mix of smaller, more efficient models at lower cost. We can train and fine-tune these models to perform specific tasks, which can be more cost-effective for niche applications.

Table 1: Generative AI providers and costing 2024

In table 1, “model accuracy” estimates are not included because they differ by scenario and cannot be quantified in general. Note also that costs may vary; the figures shown are the prices listed on each provider’s website as of July 2024. 

Generative AI pricing based on the implementation scenario 

Let’s consider typical pricing for the GPT-4 model for the below use cases. 

Here are some assumptions: 

  • We’re only dealing with English. 
  • Each token is counted as roughly 4 characters. 
  • Input: $0.03 per 1,000 tokens 
  • Output: $0.06 per 1,000 tokens 
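The characters-per-token assumption above can be turned into a quick estimator for sizing prompts before you have real usage data. This is only a sketch of that heuristic: the constant and function name are illustrative, and an actual tokenizer will give different counts depending on the text and the model.

```python
import math

CHARS_PER_TOKEN = 4  # the article's working assumption for English text


def estimate_tokens(text: str) -> int:
    """Rough token count: character count divided by 4, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


sample = "Five years of experience building data pipelines in Python and SQL."
print(f"{len(sample)} characters is roughly {estimate_tokens(sample)} tokens")
```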

Use case calculations – Resume builder 

When a candidate generates a resume using AI, the system collects basic information about work and qualifications, which equates to roughly 150 input tokens (about 30 lines of text). The output, including candidate details and work history, is typically around 300 tokens. This forms the basis for the input and output token calculations in the example below.

Image: GenAI use case, resume builder

Let’s break down the cost. 

Total Input Tokens: 

  • 150 tokens per interaction 
  • 10,000 interactions per month 
  • Total Input Tokens = 150 tokens * 10,000 interactions = 1,500,000 tokens 

Total Output Tokens: 

  • 300 tokens per interaction 
  • 10,000 interactions per month 
  • Total Output Tokens = 300 tokens * 10,000 interactions = 3,000,000 tokens 

Input Cost: 

  • Cost per 1,000 input tokens = $0.03 
  • Total Input Cost = 1,500,000 tokens / 1,000 * $0.03 = $45 

Output Cost: 

  • Cost per 1,000 output tokens = $0.06 
  • Total Output Cost = 3,000,000 tokens / 1,000 * $0.06 = $180 

Total Monthly Cost: 

Total Cost = Input Cost + Output Cost = $45 + $180 = $225 
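The same arithmetic can be wrapped in a small function so that you can rerun it with your own volumes and prices. The sketch below reuses the numbers from this example (150 input tokens, 300 output tokens, 10,000 interactions, $0.03 and $0.06 per 1,000 tokens); the function and variable names are illustrative.

```python
def monthly_llm_cost(input_tokens_per_call: int, output_tokens_per_call: int,
                     calls_per_month: int, input_price_per_1k: float,
                     output_price_per_1k: float) -> dict:
    """Return the monthly input, output, and total cost in dollars."""
    total_input_tokens = input_tokens_per_call * calls_per_month
    total_output_tokens = output_tokens_per_call * calls_per_month
    input_cost = total_input_tokens / 1000 * input_price_per_1k
    output_cost = total_output_tokens / 1000 * output_price_per_1k
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total": input_cost + output_cost,
    }


resume_builder = monthly_llm_cost(150, 300, 10_000, 0.03, 0.06)
print(f"Input:  ${resume_builder['input_cost']:.2f}")   # $45.00
print(f"Output: ${resume_builder['output_cost']:.2f}")  # $180.00
print(f"Total:  ${resume_builder['total']:.2f}")        # $225.00
```

Doubling the number of interactions or switching to a model with different per-token prices only changes the arguments, which makes it easy to compare scenarios side by side.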

Image: How to calculate generative AI cost ROI

RAG implementation cost  

Retrieval-Augmented Generation (RAG) is an AI framework that integrates information retrieval with a foundational LLM to generate text. In the resume builder use case, RAG retrieves relevant, current data at request time, without the need for retraining or fine-tuning. By leveraging RAG, we can keep the generated resumes accurate and up to date, significantly enhancing the quality of responses. 
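To make the retrieve-then-generate idea concrete, here is a toy sketch of the retrieval step. It is an assumption-laden illustration: it ranks documents with a simple bag-of-words cosine similarity instead of the embedding model and vector database a production RAG system would normally use, and the knowledge-base entries, function names, and prompt layout are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list, top_k: int = 1) -> list:
    """Return the top_k documents most similar to the query."""
    query_vec = bag_of_words(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, bag_of_words(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(query: str, documents: list, top_k: int = 1) -> str:
    """Prepend the retrieved context to the request sent to the LLM."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nRequest: {query}\nAnswer:"


# Hypothetical knowledge base for the resume builder.
KNOWLEDGE_BASE = [
    "Resume tips for data engineers: highlight pipelines, SQL and cloud platforms.",
    "Resume tips for nurses: list licences, clinical rotations and patient care skills.",
    "Cover letter advice: keep it to one page and address the hiring manager.",
]

print(build_rag_prompt("Write a resume summary for a data engineer", KNOWLEDGE_BASE))
```

The retrieved context adds input tokens to every request, which is part of why RAG costs more per interaction than a plain prompt.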

Image: Generative AI RAG-based cost

Fine tuning cost

Fine-tuning involves adjusting a pre-trained AI model to better fit specific tasks or datasets, which requires additional computational power and training tokens, increasing overall costs. For example, fine-tuning the Resume Builder model to better understand industry-specific terminology or unique resume formats would demand more resources and time than using the base model. Therefore, we are not including the fine-tuning cost for this use case.

Summary of estimating generative AI cost 

To calculate the actual cost, follow these steps: 

  1. Define use case: E.g. Resume builder
  2. Check cost of LLM service: Refer to table 1. 
  3. Check RAG implementation cost: Refer to table 3.
  4. Combine costs: LLM service, RAG cost, and calculate additional costs (Annexure-1) such as hardware, software licensing, development and other services. 
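The four steps above can be sketched as a single roll-up calculation. Everything here is a simplifying assumption: the function name is hypothetical, the Annexure-1 items are treated as one-time costs even though salaries and maintenance recur, and the zeros are placeholders to be replaced with your own figures. Only the $225 monthly LLM cost comes from the resume builder example.

```python
def first_year_estimate(llm_monthly: float, rag_monthly: float,
                        one_time_costs: dict) -> float:
    """Twelve months of recurring LLM and RAG costs plus one-time costs."""
    recurring = 12 * (llm_monthly + rag_monthly)
    return recurring + sum(one_time_costs.values())


estimate = first_year_estimate(
    llm_monthly=225,   # from the resume builder calculation
    rag_monthly=0,     # placeholder: fill in from your RAG cost table
    one_time_costs={   # placeholders: fill in from your Annexure-1 items
        "human_resources": 0,
        "technology_and_infrastructure": 0,
        "data": 0,
        "development_and_testing": 0,
        "deployment": 0,
        "indirect": 0,
    },
)
print(f"First-year estimate: ${estimate:,.2f}")
```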

A rough estimate would be somewhere between $150,000 and $250,000. These are only ballpark figures. Costs may vary depending on your needs, LLM service, location, and market conditions. It’s advisable to talk to our GenAI experts for a precise estimate. Also, keep an eye on the prices of hardware and cloud services, because they change frequently. 

You can check out some of our successful enterprise projects here. 

GenAI reducing data analytics cost

At Robosoft, we believe in data democratization—making information and data insights available to everyone in an organization, regardless of their technical skills. A recent survey shows that 32% of organizations already use generative AI for analytics. We’ve developed self-service business intelligence (BI) solutions and AI-based augmented analytics tools for big players in retail, healthcare, BFSI, Edtech, and media and entertainment. With generative AI, you can also lower data analytics costs by avoiding the need to train AI models from the ground up.

Image source: Gartner (How your Data & Analytics function using GenAI) 

Conclusion

Generative AI investments aren’t just about quick financial gains; they require a solid data foundation. Deploying generative AI with poor or biased data can lead to more than just inaccurate results. For instance, if a company’s hiring process relies on data that is biased by gender or race, it could discriminate against certain people. In a resume-builder scenario, this biased data might incorrectly label a user, damaging a company’s reputation, causing compliance issues, and raising concerns among investors.

While we write this article, a lot is changing, and our knowledge of generative AI and what it can do may change with it. However, our intent of providing value to customers and driving change prevails.


Why the Google Gemini Launch Matters

On December 7, Google announced the launch of Gemini, its highly anticipated new multi-modal AI architecture, including a Nano version optimized for hand-held devices. The announcement was greeted with mixed reviews.

Some users expressed doubts about the claims made by Google or whether the Gemini product was significantly better than GPT-4. Quoting an AI scientist who goes simply by the name “Milind,” Marketing Interactive suggested that Google is playing catch up at this point and that OpenAI and Microsoft might be ahead by six months to a year in bringing their AI models to market.

There was also plenty of public handwringing about a promotional video by Google featuring a blue rubber duck because the demo had been professionally edited after it was recorded.

Despite the tempest in a teapot about the little blue rubber duck, we believe the announcement is essential and deserves our full attention.

Decoding Gemini: How Parameters Shape Its Capabilities

Parameters are, roughly speaking, an index of how capable an AI model might be. GPT-4 is reported to have been built on about 1.75 trillion parameters, although OpenAI has not confirmed the figure.

We don’t know how many parameters were used to build Gemini. Still, Ray Fernandez at Techopedia estimated that Google used between 30 and 65 trillion parameters to make Gemini, which, according to SemiAnalysis, would equate to an architecture that might be between 5 and 20x more potent than GPT-4.

Beyond the model’s power, there are at least four points of differentiation for Gemini.

#1. Multi-modal Architecture: Gemini was built as a multi-modal architecture from the ground up. Competing architectures keep text, images, video, and code in separate silos, which forces other companies to roll out those capabilities one by one and makes it harder for them to work together optimally.

#2. Massive Multitask Language Understanding: Gemini scored higher than its competition on 30 out of 32 third-party benchmarks. On some it was only slightly ahead and on others by a wider margin, but overall that’s an impressive win-loss record.

In particular, Gemini reached a notable milestone by outscoring human experts on a tough test called Massive Multitask Language Understanding (MMLU). Gemini scored 90.04%, versus human expert performance of 89.8%, according to the benchmark authors.

#3. AlphaCode 2 Capabilities: Alongside the launch of Gemini, Google also launched AlphaCode 2, a new, more advanced coding system that now ranks within the top 15% of entrants on the Codeforces competitive programming platform. That ranking represents a significant improvement over its predecessor, which ranked in the top 50% on that platform.

#4. Nano LLM model: Also simultaneous with the launch of Gemini was the Nano LLM model, which is optimized to run on a handheld device, bringing many of Gemini’s capabilities to edge devices like handheld phones and wearables. For now, that’s a unique advantage for Gemini.

Image: Points of differentiation for Google Gemini

What are the practical implications of Gemini Nano on a handheld device?

Companies like Robosoft Technologies that build apps will collaborate with clients to test the boundaries of what Nano can do for end users using edge devices like cell phones.

Edge computing emphasizes processing data closer to where it is generated, reducing latency and dependence on centralized servers. Cell phones will undoubtedly be first in line to benefit from Nano, because they can perform tasks like image recognition, voice processing, and various other computations on the device itself.

What about Wearables or other Types of Edge Devices?

Google hasn’t said whether Nano can run on wearables or other edge devices, but its design and capabilities suggest it probably can.

First, Nano is a significantly slimmed-down version of the full Gemini AI model, making it resource-efficient and potentially suitable for devices with limited computational power, like wearables.

Also, Nano is designed explicitly for on-device tasks. It doesn’t require constant Internet connectivity, making it ideal for applications where data privacy and offline functionality are crucial — both are relevant for wearables.

In particular, we noticed that Google’s December 2023 “feature drop” for Pixel 8 Pro showcased a couple of on-device features powered by Nano, including “Summarize” in the Recorder app and “Smart Reply” in Gboard. In our opinion, these capabilities could easily translate to wearables.

What about Apple Technology?

There’s no official indication that Nano is compatible with Apple technology. We think such compatibility is unlikely because Google primarily focuses on Android and its ecosystem.

However, the future of AI development is increasingly open-source and collaborative, so it’s possible that partnerships or independent efforts by members of the AI ecosystem — including companies like Robosoft Technologies — could lead to compatibility between Gemini Nano and Apple devices.

Enterprise-Level Use Cases for Gemini Pro

From what we know so far, Gemini Pro offers good potential to enable or enhance various enterprise-level applications. Here are some critical use cases that we think are most likely to be among the first wave of projects using Gemini Pro.

Customer Service and Workflows

  • Dynamically updating answers to FAQs
  • Helping with troubleshooting
  • Routing questions to the appropriate resources
  • Extracting and summarizing information from documents, forms, and datasets
  • Filling in templates
  • Maintaining databases
  • Generating routine reports

Personalization and Recommendations

  • Creating personalized marketing messages and recommendations
  • Optimizing pricing
  • Automating risk assessments
  • Streamlining loan applications
  • Providing personalized health treatment plans
  • Recommending preventive health measures

Business Process Optimization

  • Identifying process delays
  • Optimizing resource allocation
  • Streamlining decision-making processes with improved information flow
  • Identifying cost-saving opportunities

Security and Fraud Detection

  • Identifying potential cyber-attacks
  • Identifying malicious code and protecting sensitive data
  • Analyzing financial data for suspicious activity to help prevent losses

Content Moderation and Safety

  • Moderating user comments and posts on social media, including forum discussions
  • Improving the correct identification of spam

Above all, a very foundational use for Google Gemini Pro might be to enable the implementation of an enterprise-level generative AI copilot.

What is an Enterprise-Level Generative AI Copilot?

A generative AI copilot is an advanced artificial intelligence system designed to assist and augment human users in various tasks, using its generative capabilities to contribute actively to creative and decision-making processes. This type of technology is customized for specific enterprise applications, learning from user interactions and context to provide tailored support. It goes beyond conventional AI assistants by actively generating real-time suggestions, solutions, or content. It works alongside users to enhance productivity, creativity, and problem-solving within organizational workflows.

Why might Gemini Pro be a good platform for building a generative AI copilot?

We think that Gemini Pro should be considered a possible platform for building a copilot. Its capabilities and characteristics align well with the requirements of such a system.

First, Gemini Pro can process and generate human language effectively, enabling it to understand user intent and respond coherently and informally. It has a knowledge base built on 40 trillion tokens, equivalent to having access to millions of books. It can reason about information, allowing it to provide relevant and insightful assistance to users.

Also, like other generative AI platforms, Gemini Pro can adapt its responses and behavior based on the context of a conversation, helping to ensure that its assistance remains relevant and helpful.

So that’s a good foundation.

Upon such a foundation, Google relies on the partners in its ecosystem to build an overall solution that addresses enterprise needs. These include ensuring that enterprise data is secure and that information inside the enterprise is not used to train public models, controlling access to the data based on job roles and other factors, helping with data integration, and building an excellent user interface. These are examples of areas where technology partners like Robosoft Technologies make all the difference when bringing an AI-based solution to life within an enterprise.


Conversational AI breaks through user barriers – Designing a fulfilling conversation is key

Hey Alexa, what is conversational AI? If you’ve ever interacted with a virtual assistant like Siri, Alexa or Google Assistant, then you’ve experienced conversational Artificial Intelligence (AI). These game-changing automated messaging and speech-enabled applications have permeated every walk of life, creating human-like interactions between computers and humans. From checking your appointments and carrying out bank transactions, to tracking the status of your food or delivery order and learning the names of songs, conversational AI will soon be playing a lead role in your digital interactions.

So, how does Conversational AI work?

Users interact with conversational AI through text chats or voice. Simple FAQ chatbots require specific terms to derive responses from their knowledge bank. However, applications based on conversational AI are far more advanced – they can understand intent, provide responses in context, and learn and improve over time. While conversational AI is the umbrella term, there are underlying technologies such as Machine Learning (ML), Natural Language Processing (NLP), Natural Language Understanding (NLU) and Natural Language Generation (NLG) that enable text-based interactions. In the context of voice, additional technologies such as Automatic Speech Recognition (ASR) and text-to-speech software enable the computer to “talk” like a human.

Image: Conversational AI process

Imagine you give a command to a conversational AI application to track your order. This input could either be spoken or text. If spoken, the ASR converts the spoken phrases into machine-readable language. Once converted by ASR, the application then moves into the NLP stage, where it first uses NLU to understand the context and intent of the message. Based on this, a response is formed through a dialogue management system and generated into an understandable format by NLG. The response is then either delivered in text, or in the case of voice, converted to speech through text-to-speech software. All this happens in a matter of seconds, to get the information you need about the status of your order.
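The order-tracking flow described above can be sketched as a chain of small functions, one per stage. This is a deliberately simplified illustration: the NLU step is a keyword and pattern match instead of a trained model, the order data and ID format are hypothetical, and the ASR and text-to-speech stages are left out because they would simply wrap the text input and output.

```python
import re

# Hypothetical order store standing in for a backend system.
ORDERS = {"A123": "out for delivery"}


def understand(text: str) -> dict:
    """NLU: extract the intent and any order ID from the user's message."""
    match = re.search(r"\b[A-Z]\d{3}\b", text)
    intent = "track_order" if "order" in text.lower() else "unknown"
    return {"intent": intent, "order_id": match.group(0) if match else None}


def manage_dialogue(parsed: dict) -> dict:
    """Dialogue management: decide what to do next based on the NLU result."""
    if parsed["intent"] != "track_order":
        return {"action": "fallback"}
    if parsed["order_id"] is None:
        return {"action": "ask_order_id"}
    status = ORDERS.get(parsed["order_id"])
    if status is None:
        return {"action": "order_not_found", "order_id": parsed["order_id"]}
    return {"action": "report_status", "order_id": parsed["order_id"], "status": status}


def generate_response(decision: dict) -> str:
    """NLG: turn the decision into a sentence the user can read or hear."""
    if decision["action"] == "report_status":
        return f"Order {decision['order_id']} is {decision['status']}."
    if decision["action"] == "ask_order_id":
        return "Sure, what is your order ID?"
    if decision["action"] == "order_not_found":
        return f"I could not find order {decision['order_id']}."
    return "Sorry, I did not understand that."


def handle_text(text: str) -> str:
    return generate_response(manage_dialogue(understand(text)))


print(handle_text("Where is my order A123?"))
print(handle_text("Can you track my order?"))
```

Keeping the stages separate means each one can later be replaced, for example swapping the keyword match for a trained intent classifier, without touching the rest of the pipeline.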

Conversational AI will create a real and personal relationship between humans and technology

As our world becomes more digital, conversational AI can enable seamless communication between humans and machines, with interactions that are an integral part of daily life. Besides improved user engagement, conversational assistants allow round-the-clock business accessibility and reduce manual errors in sharing information. They reduce the dependency on people for multi-lingual support and enable inclusion by removing literacy barriers. The benefits and potential of conversational AI are inviting businesses and technology to make heavy investments in the space.

Sales, service and support have been early adopters of conversational AI, because of the structured nature of information exchange that these functions require. This has decreased query resolution times, reduced the dependence on human agents and provided the opportunity for 24/7 sales and service. The AI chatbots are even able to deliver recommendations on purchases based on personalized customer preferences. According to Gartner, chatbots and conversational agents will raise and resolve a billion service tickets by 2030.

Across sectors, conversational AI is transforming interactions between people and systems. The banking sector is banking on conversational AI to provide a superior experience through transactions such as providing balance information, paying bills, marketing offers and products and so on, all without human intervention. The insurance sector is using chatbots to help customers choose a policy, submit documents, handle customer queries, renew policies and more. The healthcare sector is using these chatbots to check patient symptoms, schedule appointments, maintain patients’ medical data, and share medication and routine check-up reminders. Automobiles are becoming a cockpit for personal AI assistants or in-car experiences.

Businesses are also using conversational AI to manage their own workforce and improve the employee experience. Through chatbots, they make vital information available to employees 24/7, reducing the need for human resources to manage queries and processes. The possibilities and opportunities with conversational AI are endless and use cases are available in every industry.

Overcoming user frustration with Conversational AI through better engineering and design

While there are several benefits to conversational AI, you might be familiar with many instances when the conversation ends in frustration. As AI technology evolves and matures, these challenges must be addressed at the design and engineering stage.

In terms of design, the success of the platform entirely hinges on user interface and experience. It must be easy to use, intuitive, and must fit seamlessly into the overall design of the application and customer journey. While UI is important, the conversation itself is the most critical aspect. It is important to ensure that the conversational design flows smoothly, follows well-tested and widely applicable patterns and has exception rules inbuilt into the script design.

The more human-like the conversation is, the better the user’s acceptance

  • Draw from real life – To design a fulfilling conversation, architects and UX designers must draw from real life and from UX design principles. The product has to be designed for ease of use, ease of conversation and ease of resolution. It has to be easy for the user to find, access and use within the overall product ecosystem. This can be achieved by following time-tested UI and UX principles in developing visual or auditory experiences.
  • Build trust – To build trust in conversational AI, small talk or playful ways to engage with the AI can be built into the engagement.
  • Understand the target audience – Understanding the target audience and their needs is pivotal to the success of conversational AI. An in-depth study of the demographics helps in building a platform that is unbiased. Incorporating languages, accents and cultural nuances allows the user to relate better and enable smoother interactions.
  • Solve customer problems, not business problems – A deep understanding of the customer ensures that the conversation design is solving for the customer, rather than solving for the business problem. When the focus is on the business problem, there is a possibility of ignoring the human-like flow of interaction. Putting the customer first helps in building a valuable and desirable interface that is a win-win for both the customer and the business. It is also important to ask what the system will help resolve and design the conversation to ensure the most frequent use cases for the application are solved logically and seamlessly.
  • Recover from lagging conversations – The AI bot must also have the ability to learn from mistakes, recover from broken conversations and redirect to human agents when conversations cannot be fulfilled through AI. This has to be designed seamlessly into the interface, ensuring the customers trust the system and come back to use it in the future.
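The recovery and handoff behaviour in the last point can be expressed as a simple rule. The sketch below is one possible policy, not a standard: the confidence threshold, the limit of two failed turns, and the action names are all assumptions chosen for illustration.

```python
from typing import Tuple

CONFIDENCE_THRESHOLD = 0.6  # assumed cut-off for trusting the bot's understanding
MAX_FAILED_TURNS = 2        # assumed number of misses before escalating


def next_step(confidence: float, failed_turns: int) -> Tuple[str, int]:
    """Return the action to take and the updated count of failed turns."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return "answer", 0  # understood: respond and reset the counter
    failed_turns += 1
    if failed_turns >= MAX_FAILED_TURNS:
        return "handoff_to_human", failed_turns
    return "ask_to_rephrase", failed_turns


failed = 0
for confidence in [0.4, 0.3]:
    action, failed = next_step(confidence, failed)
    print(action)
```

In this example the first low-confidence turn asks the user to rephrase, and the second one routes the conversation to a human agent.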

Engineering can help provide human-like interaction

  • The systems have to be able to deal with noisy settings and decipher languages, dialects, accents, sarcasm, and slang that could influence intent in the conversation. Intense data training, larger varied datasets, language training and machine learning (ML) could solve these challenges as the technology matures.
  • Another concern with conversational AI is data privacy and protection. To gain user trust, security must be paramount and all regional privacy laws must be adhered to.
  • Backend integration of conversational AI platforms may decide their success or failure in the market. The platform must integrate with CRM, after-sales, ticketing, databases, analytics systems and so on, to get appropriate data for the user, and provide appropriate data to the business.
  • Finally, the AI system should be backed by analytics and data, so that data scientists have invaluable insights to continuously improve the system.

Conversational AI is growing at an incredible pace and at a massive scale. This is because of the immense possibility that conversational AI has to bridge the gap between humans and technology. There is vast demand also due to the efficiencies and cost savings that conversational AI can offer businesses with quick, accurate and effortless query resolution. Businesses across industries should leverage this technology of the future to deliver a consistent and superior user experience.


Will Chatbots Replace Traditional Apps for Brands?

Last year, the marketing promotion for the movie Insidious: Chapter 3 included a chatbot where fans could talk on the Kik app with a bot version of a character from the film.
