Is it Better to Build or Buy a Voice App Solution? Ask Yourself 3 Key Questions

August 15, 2018


The rapid adoption of voice-activated devices is driving brands to start building engaging voice applications. In addition to identifying a voice-first use case to build, brands must also decide which tools and techniques their team should use to achieve this vision. Over the past seven years of building conversational experiences, we’ve seen it all, and we want to share our key learnings and considerations for beginning your journey with voice.

Voice App Development Challenges

Organizations early in the voice journey have had to identify and rely upon highly technical skill sets. These roles often include AI/machine learning scientists, conversational UI designers, API designers, database administrators, and DevOps engineers. This approach has proven expensive and has taken longer than expected. Once an app is deployed, the maintenance and improvements needed to drive engagement and success continue to require high investment.

It’s common to see a team working with a variety of tools and frameworks, ranging from technical teams writing custom code in languages such as Python or Node.js, to creative professionals designing conversations and scripting lines in spreadsheets! That is a creative use of spreadsheets, but it turns out they are optimized for financial calculations and forecasts, not dialogue writing and design.

These hurdles are the result of applying inefficient approaches to the new challenges of building great, modern voice experiences. There’s a better and more efficient way to build engaging voice apps. Here are the considerations and learnings we’ve encountered in our own journey, which have led to some of the most exciting voice apps on the market, such as HBO’s Westworld, Fremantle’s Match Game, and many others.

Top 3 determining factors for building voice apps from scratch or leveraging a solution such as PullString Converse.

1. Do you have the resources available to build from the ground up?

When building voice apps from the ground up, your organization will need the capability to:

  • Identify the technical resources required to build your voice app
  • Schedule available technical resources to adequately staff the project
  • Train your programmers on the voice platform SDK (such as the Amazon Alexa Skills Kit) and conversational AI concepts such as ASR, intent recognition, slot filling, etc.
  • Train your programmers on cloud services (such as Amazon Web Services) and on writing code for them (such as Lambda functions)
  • Create and maintain a process for collaboration between technical resources and non-technical content creators
  • Deploy, monitor, and maintain your voice application’s infrastructure
  • Write the Alexa Skill code, including all logic for managing the conversational state and dialogue flow
  • Test, iterate, re-deploy, and repeat as needed, based on user feedback
  • Extract lines of dialogue contextual to user input from source code if other divisions of your company need to review the content of what your skill says to users
  • Publish code, intents, and Alexa skill
  • Monitor skill performance and experience
  • Analyze data and iterate with technical team
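
To give a feel for the "conversational state and dialogue flow" item in the list above, here is a minimal sketch in Python of the kind of logic a team would write by hand. It does not use any voice platform SDK. The intent names, slot names, prompts, and the Session class are all hypothetical, chosen only for illustration, and a production skill would also need error handling, persistence, and platform-specific request parsing.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical slots for an illustrative "order a pizza" intent.
REQUIRED_SLOTS = ("size", "topping")

# Hypothetical prompts spoken when a required slot is still missing.
PROMPTS = {
    "size": "What size would you like?",
    "topping": "Which topping would you like?",
}


@dataclass
class Session:
    """Conversation state carried from one turn to the next."""
    slots: Dict[str, str] = field(default_factory=dict)
    done: bool = False


def handle_turn(session: Session, intent: str,
                slots: Dict[str, Optional[str]]) -> str:
    """Advance the dialogue by one turn and return the line to speak."""
    if intent == "CancelIntent":
        session.slots.clear()
        session.done = True
        return "Okay, cancelled."
    if intent != "OrderPizzaIntent":
        return "Sorry, I didn't get that. You can say: order a pizza."

    # Merge any slot values recognized on this turn into the session.
    for name, value in slots.items():
        if name in REQUIRED_SLOTS and value:
            session.slots[name] = value

    # Ask for the first required slot that is still missing.
    for name in REQUIRED_SLOTS:
        if name not in session.slots:
            return PROMPTS[name]

    # All slots are filled, so the dialogue can complete.
    session.done = True
    return f"Ordering a {session.slots['size']} {session.slots['topping']} pizza."
```

Even in this toy form, the spoken lines live inside source code, which is why reviewing dialogue with non-technical colleagues requires extracting it first.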

For important voice apps, you’ll need several developers to perform the work. You will also need a project manager, creatives, hardware, software, and computing resources. If you have an internal team (or agency partner) and the capabilities to handle this already, that's great! With the right voice-centric tools and systems, you will find your team can become highly productive.

Alternatively, your organization could consider leveraging an existing purpose-built solution such as PullString Converse to build your voice apps with some of those same team members or agency partners. Converse provides cross-functional teams of creative and technical users with an end-to-end solution for designing, prototyping, and publishing voice apps. Visual design flows empower designers and creatives to craft engaging conversations, while an integrated environment for developers allows them to focus on integrations with existing systems and APIs. Most importantly, our enterprise-class conversational AI engine and deep integration with voice platforms free your team from maintaining boilerplate code and its accompanying infrastructure, so they can focus on building the best possible voice experience for your brand.

2. What tools should these resources use?

Assuming you’ve got the talent inside your organization to begin building compelling voice apps, the next series of efforts will focus on the user experience of interacting with your voice app. From a design perspective, a variety of tools have been repurposed to help define user flows and interaction models. These tools include Microsoft Visio, Excel, PowerPoint, and design products from Adobe, InVision, or Sketch. None of these tools integrates directly with the voice platforms being designed for, so several steps are required to go from a concept to an actual experience one can interact with. This introduces additional friction and delay in building a compelling solution.

Alternatively, teams might consider PullString’s visual authoring environment, Converse, which allows creative professionals and programmers to work together. Converse empowers creative professionals and non-technical users to visually design a natural and enjoyable user experience, and directly test and prototype that experience on-device or in-browser. This enables teams to rapidly iterate through the design process and build a richer, deeper experience that your customers will enjoy using. This approach also affords developers the opportunity to focus on the unique business logic of the voice app instead of having to also focus on building a conversational AI engine from scratch and maintaining the voice platform integration.

3. How much time can you afford to get your voice app to market?

In our journey with organizations over the past several years, we’ve seen the value in quickly placing experiences in market to test, validate, and learn. As more and more brands aggressively enter the market, deploying quickly and iterating constantly to evolve your voice app experiences is increasingly important. We call this the “crawl, walk, run” approach. In other words, quickly getting to market with a “crawl” voice app is better than slowly trying to perfect a “run” experience.

As we’ve covered above, building voice apps from the ground up requires developers to become familiar with the various platform components (e.g. SDKs and APIs) and to deploy a set of services on platforms like AWS. They must also write boilerplate conversational AI code to handle things like state management, slot type definitions, interaction models, and content handling, and then record and upload media assets (like sound effects or custom audio files). That takes a lot of time and resources. If you don't already have experience building voice applications, this overhead can push you over budget and past deadlines.
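
As a rough illustration of the interaction model and slot type boilerplate mentioned above, the sketch below assembles an interaction model as a Python dictionary and prints it as JSON. The field names follow the Alexa interaction model JSON layout as we understand it, so verify them against the current Alexa Skills Kit documentation before relying on them. The invocation name, intent, slot types, and sample utterances are hypothetical.

```python
import json
from typing import Dict, List


def build_interaction_model(
    invocation_name: str,
    intents: Dict[str, dict],
    slot_types: Dict[str, List[str]],
) -> dict:
    """Assemble an interaction model as a plain dictionary.

    The nesting mirrors the Alexa interaction model JSON layout
    (an assumption to check against the platform documentation).
    """
    return {
        "interactionModel": {
            "languageModel": {
                "invocationName": invocation_name,
                "intents": [
                    {
                        "name": name,
                        "slots": [
                            {"name": slot, "type": slot_type}
                            for slot, slot_type in spec.get("slots", {}).items()
                        ],
                        "samples": spec.get("samples", []),
                    }
                    for name, spec in intents.items()
                ],
                "types": [
                    {
                        "name": type_name,
                        "values": [{"name": {"value": v}} for v in values],
                    }
                    for type_name, values in slot_types.items()
                ],
            }
        }
    }


# Hypothetical example data for an illustrative pizza-ordering skill.
model = build_interaction_model(
    invocation_name="pizza helper",
    intents={
        "OrderPizzaIntent": {
            "slots": {"size": "PIZZA_SIZE", "topping": "PIZZA_TOPPING"},
            "samples": [
                "order a {size} pizza",
                "I want a {size} {topping} pizza",
            ],
        }
    },
    slot_types={
        "PIZZA_SIZE": ["small", "medium", "large"],
        "PIZZA_TOPPING": ["mushroom", "pepperoni"],
    },
)

print(json.dumps(model, indent=2))
```

Every new intent, slot, or sample utterance means another change to a structure like this, followed by a redeploy, which is part of the overhead described above.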

Leading brands have used PullString software to build highly engaging voice apps. They weighed the cost of onboarding and training their developers, the development time required, and the cost of computing resources, and determined that PullString’s purpose-built software platform provided the best approach to building rich, engaging experiences.

PullString Converse gives creative professionals the ability to focus on conversational elements and go further in voice design before developers help with integration. Converse automatically handles the complexity of artificial intelligence, state management, conversational techniques, and much more to enable intelligent, high-fidelity voice experiences. In addition, PullString Converse is powered by Conversation Cloud, a scalable and secure conversational AI platform for voice applications on any device.

To learn more about how PullString voice technology software can enable your team to collaboratively design, prototype and publish voice applications, please contact us for a complimentary discovery call and demo of Converse.


Written by Patrick Ku

As Head of Solutions, Patrick works with all PullString customers to make their computer conversation dreams a concrete reality. Starting as a software engineer on the PullString platform, Patrick is excited to bring technical product knowledge and apply it to specific customer needs throughout the sales and onboarding process. Before PullString, Patrick spent nine years at DreamWorks Animation as a Technical Director bridging the gap between art and technology. He earned his Computer Science degree with a 3D Animation minor from the University of Southern California.
