Speech technologies market trends shift: from hype to must-have

According to Gartner, AI is almost a definition of hype. Speech recognition, nevertheless, is on the plateau of productivity, and NLP/NLU-powered solutions are already moving from hype to must-have status globally. According to some predictions, the speech and voice recognition market was valued at USD 5.15 bn in 2016 and is expected to reach USD 18.30 bn by 2023, at a CAGR of 19.80%. By 2020, 50% of analytic queries will be generated using searches based on natural-language processing, according to Gartner, and 20-25% of searches made with the Google Android App in the US are already voice searches, according to Google.
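As a quick sanity check on the market figures quoted above, the compound annual growth rate can be recomputed from the two endpoints. This is only an illustrative sketch: it assumes seven compounding years (2016 to 2023), and the function names are our own rather than taken from any cited report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, an end value and a period."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached when start_value grows at a constant annual rate for the given years."""
    return start_value * (1 + rate) ** years


# Figures quoted in the text; the 7-year span is an assumption (2023 - 2016).
years = 2023 - 2016
implied = cagr(5.15, 18.30, years)
projected = project(5.15, 0.198, years)

print(f"Implied CAGR: {implied:.2%}")
print(f"USD 5.15 bn grown at 19.80% for {years} years: USD {projected:.2f} bn")
```

Under that assumption, the quoted start value, end value and growth rate agree with each other to within rounding.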

There are some must-haves for vendors focused on reshaping customer interactions when they are discussing solutions with their clients. At Spitch AG, we know from practical experience how critical it is to translate tech fashion into productive realities and tangible business benefits.

We always check the industry landscape and tailor the solution to customer needs and best practices. Thanks to our customers and team, Spitch adheres to all the principles listed below and has therefore been involved only in successful projects. More generally, though, there are quite a number of cases where the reasons for failure are not carefully analysed and mistakes are repeated. This leads to customer frustration and longer adoption time.

In the following paragraphs, I would like to highlight and comment on several important topics to keep in mind for a successful outcome of speech-enabled projects.

Use case definition. There are many possible scenarios for implementing speech technology within each industry. It is imperative to start with one specific use case and add others along the way; this helps avoid losing focus and client interest. Moreover, it is essential to concentrate efforts in order to obtain results in a reasonable time.

This was exactly our strategy in the case of Swisscard, one of the leading Swiss card product providers, which has just received the "Golden Headset Award" for the most innovative contact centre solution in Switzerland. The intelligent voice-driven IVR solution is based on Spitch's AI/NLP technologies and can automatically route calls to the right agents. It also provides agents with a transcript of keywords in a text pop-up window to help them understand the call topic and key issues at a glance before they answer the call. This was the initial specific use case that matched one of the customer’s needs. The next step would be adding speech analytics to enable customer experience measurement and improvement, as well as automatic or semi-automatic call handling, but a clear use case definition from the very beginning was a crucial success factor in this project.

In another case, with one of the largest contact centres in Italy that needed to perform automatic compliance checks, precise use case definition helped us deliver a simple and elegant solution. A specific tool was integrated that could check all conversations in real time against the script uploaded into the tool, so that the supervisor could see any discrepancy immediately. The same tools are relevant for ensuring regulatory and conversational script compliance in other industries, such as telecom and banking, especially under the MiFID II rules.

Agile deployment. In addition to clearly defining a specific use case in the beginning, it is also important to remain flexible throughout the project as we are introducing a new innovation paradigm. Hence the need for an agile deployment, considering that the new path is being learnt “on the fly”. In many projects, especially in the speech analytics domain, our customers, who have a lot of voice data to analyse for improving customer experience (CX), admit they have no clue how to do it practically and what possible output could be expected. So, we propose an agile approach, by collecting, processing and analysing the data with each sprint based on the previous sprints’ results.

Data and Security. Client audio data availability is normally restricted, and some data is confidential because it contains personal information of customers or for other reasons. Automatic sensitive data redaction and on-premise delivery ensure personal data privacy in line with GDPR. Syntec, a UK network operator providing a full range of telecom services to clients all over the world, is using Spitch’s solutions on premises to enable voice input for credit card numbers in accordance with PCI DSS compliance terms. Compliance and security are key, but the customer experience is also driven by the accuracy of the automated solution. In the Syntec case, for example, previously tested out-of-the-box solutions did not provide the expected level of accuracy. From the start, Spitch has delivered credit card number recognition accuracy of 92%, and it is still growing. Customers can pronounce credit card numbers in any sequence without an impact on accuracy. Solution language models were improved in just 3 weeks thanks to Spitch Lingware Suite tools. As a result, Spitch has achieved seamless integration with partners’ solutions and payment systems, leading to cost reduction in a reliable PCI DSS-compliant environment.
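To make the idea of automatic sensitive data redaction more concrete, here is a minimal sketch of how card numbers could be masked in a text transcript. It is not Spitch's implementation: the pattern, the mask text and the function names are hypothetical, and it assumes that the digits already appear as numerals in the transcript. A Luhn checksum is used so that other long digit sequences are left untouched.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum used by payment cards."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_card_numbers(transcript: str, mask: str = "[REDACTED CARD]") -> str:
    """Replace digit sequences that look like valid card numbers with a mask."""
    def _replace(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return mask if luhn_valid(digits) else match.group()

    return CARD_PATTERN.sub(_replace, transcript)


# 4111 1111 1111 1111 is a widely used test number, not a real card.
print(redact_card_numbers("my card is 4111 1111 1111 1111 thanks"))
```

A production system would also need to handle numbers spoken as words and to redact the audio itself, neither of which this sketch attempts.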

Customer experience. Voice-driven solutions should be user-friendly, simple to implement and maintain, and quick to deliver results. They should also be fun to use, and obviously secure and trustworthy. Otherwise customers do not receive them well. People usually have no issues talking with bots, but they do it differently from the way they talk to other human beings. It is very important, therefore, to consider this difference and make the robot say things (including a simple introduction and greeting) that match the customer’s perception; this will drive adoption and CX.

In one implementation, for example, Spitch’s bot handled automated soft collection calls with the purpose of obtaining a promise to pay, and achieved results comparable to those of an expert human communicator, who is not always available.

In the most challenging case of BPS-Sberbank, training the system on dialogues between a customer and the contact centre bot, instead of customer-agent conversations, led to a 2.2-fold accuracy improvement, greatly improving customer experience.

A conversational bot that deals well with human communication is vital; in many cases in Italy, the bot will entertain people that enjoy talking with the machine, and an attractive dialogue flow is instrumental to this purpose.

Another interesting case is that of AMAG, the largest car dealer in Switzerland. In this case, the Spitch solution asks callers for their vehicle’s VIN, car brand and model and, after cross-checking the data, routes the call straight to the right agent. Such simple automation helps reduce call processing time by approximately 15 seconds, delivering tangible cost savings immediately and improving customer experience.

Pragmatic approach. It is also critical to clearly understand the real benefits customers expect when using speech technologies, be it cost savings, CX improvement, both or others; a project should not be driven by curiosity alone. Clear business KPIs (and not just technical accuracy) should be set. Providing transparent and easy ROI calculation is paramount to measurable cost savings. Our implementations with Swiss cantonal banks may serve as a good example of customers’ intention to achieve both cost savings and CX improvement, which makes the technology affordable even for small entities. Banking executives see clear benefits in using voice technologies to automate certain routine procedures, e.g. handling debit card balance requests. Use cases like that, with clearly defined success criteria, lead to easily measurable cost reduction and CX improvement if properly implemented. A recent survey conducted by Spitch during our webinar for bankers, representing banks of all sizes, showed that over 76% of attendees were either evaluating or planning to offer a voice service soon. 57% of the bankers saw a very high potential in voice biometry. 73% would opt to use an automated voice response system instead of waiting for an agent response, even if there is only a very short waiting time.

Proper integration into the existing SW and HW landscape. Many projects go bust because the key components are not all properly defined at the very beginning. Overlooked items dramatically influence the project timeline and costs, often leading to declining client interest. On the one hand, it is critical to have a good connection to the system integration side; on the other hand, we also leverage our practical experience from deployments in the same environments and systems. For instance, the partnership and effective collaboration with Swisscom has helped Spitch successfully complete not only the Swiss credit card company project mentioned above, but also other projects. Spitch and Swisscom offered a Swiss retail bank an approach to optimize its video-banking solution by adding voice biometrics as one of the verification factors and reducing the time needed for ID verification. Another good example is our partnership with Avaloq: Spitch’s solution has been integrated with Avaloq banking software to automate processes where spoken language was used. Avaloq is now able to offer its clients reliable speech technology solutions integrated into its own core banking environment, as well as remote self-services in the future.

In some cases, added flexibility is required to deal with unexpected technical challenges in the middle of the project. In our case with one Swiss bank that implemented voice biometrics, for example, the audio available for model training was compressed, which made it unsuitable for the purpose. In this case, however, we managed to collect data from the bank’s customers quickly enough using proprietary online tools.

Time to market. Deep integration of bespoke NLP/NLU solutions into clients’ systems is not the only challenge. Sometimes, additional training is needed due to restricted domain needs. When customer-specific data are difficult to retrieve or simply take too long to obtain, smart crowdsourcing can help collect training data and thus reduce the time to market.

These instruments are part of the Spitch Lingware Suite, which we encourage our clients and partners to use in order to save costs, reduce time to market, or simply fine-tune or develop their own speech solutions, e.g. for new industries. The use of Spitch Lingware Suite instruments has allowed partners like Dotvocal, AdNovum, BSS and many others to deliver well-trained and accurate solutions to end customers, even within very specific domains, in just a few weeks. Adding a voice-control component to a Swiss Railways (SBB) app, for example, took just 9 weeks from scratch to a top-notch solution that understands more than 25,000 station names in different Swiss German dialects with an unprecedented level of accuracy.

To summarise the above success factors: the winning solutions in 2018 were those that have enabled truly engaged human-to-human live interactions augmented with AI/NLU functions, that have added accurate and reliable voice-driven services, robotising routine operations, and, therefore, re-configuring customer experience and self-services. According to McKinsey, that appears to be one of the dynamics shifting the industry.

Getting voice technologies right means making wise choices and starting with the simple solutions that make it easier for customers to access services. Moving from something simple to a greater complexity, higher level of RPA (Robotic Process Automation) and, eventually, comprehensive platform deployments is a journey where clients should be able to easily control risk and appreciate the benefits. The key point is to align customer care strategies with the acquisition of future-oriented solutions, as opposed to legacy software that will keep you lagging behind.

Rapid change is underway in the speech technologies market. Let’s try to draw some predictive value from the analysis of the current trends and already existing business trajectories:

Omni-channel communication platform. Cost saving and CX improvement are very important. The next step, however, is to increase sales as well, not only by converting voice into action, but also by handling natural conversations with customers regardless of where they take place. Spitch's omni-channel communications platform, based on a conversational user interface, allows customers of our clients and partners to really have their voices heard via any channel they prefer, at any time, and handles comprehensive conversations fully automatically. An AI/NLU-driven conversational platform fully utilizes the growing potential of deep neural networks, intent matching, semantic interpretation and emotion detection. It is one of the most likely directions in which the contact centre industry will be heading in 2019-2020.

Shared biometry. Implementing bespoke solutions based on voice biometrics for customer identification and authentication purposes is not an easy task, especially for smaller clients. Implementations take a fairly long time because of the need, among other things, to collect correct voiceprints and to set the right balance between false acceptance and false rejection rates in line with the security requirements and regulatory frameworks. At the same time, cloud-based platform solutions that share voiceprints and deliver authentication as a service may help save a tremendous amount of effort and money. Considering the push in some countries towards the creation of national biometric identification systems for accessing various state and business services, such as the Russian biometric platform, it is clear that those businesses that are currently experimenting with voice biometrics or have already implemented it will be among the future winners.
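The balance between false acceptance and false rejection mentioned above can be illustrated with a small sketch. It assumes that the biometric engine returns a similarity score per verification attempt and that attempts scoring at or above a threshold are accepted. The scores below are invented for illustration only, and the function names are hypothetical rather than part of any Spitch product.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when scores at or above the threshold are accepted."""
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr


def pick_threshold(genuine_scores, impostor_scores, max_far):
    """Return the lowest observed score usable as a threshold with FAR <= max_far.

    FAR can only fall as the threshold rises, so the lowest qualifying threshold
    also gives the lowest FRR among the candidates. Returns None if none qualifies.
    """
    candidates = sorted(set(genuine_scores) | set(impostor_scores))
    for threshold in candidates:
        far, frr = far_frr(genuine_scores, impostor_scores, threshold)
        if far <= max_far:
            return threshold, far, frr
    return None


# Invented scores: true speakers (genuine) and other speakers (impostor).
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.12, 0.35, 0.48, 0.70, 0.22]

print(pick_threshold(genuine, impostor, max_far=0.0))  # strict security setting
print(pick_threshold(genuine, impostor, max_far=0.2))  # more convenience, less security
```

Tightening the allowed false acceptance rate pushes the threshold up and rejects more genuine callers, which is why the setting has to follow the security and regulatory requirements of each deployment.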

Externally managed omni-channel interface services. Voice user interfaces will be increasingly offered as a service by external providers, e.g. Swisscom’s managed services, making it possible for customers of any size to experience the advantages of the entire communication platform as a trusted provider’s service relatively quickly, cheaply and easily. This approach will make such services more affordable to a wide range of smaller businesses.

Voice data becoming central for customer experience improvement. Voice data is becoming an important part of the big data analytics industry and RegTech (thanks to MiFID II and GDPR, among others) with a potential to eventually improve sales through further personalisation. The smart use of voice data will stand behind improved customer retention and, generally, customer experience management techniques.

Spitch AG has developed the entire spectrum of voice technologies to eventually enable a machine to understand free speech as a human does. However, staying on top of the above trends with the corresponding product offering is not enough for success. It is crucial to know the client’s business inside out and be capable of offering solutions that guarantee fast time to market and meaningful business effects, while being simple and easy to use. Spitch’s product division is focused on making it happen, and we are building mutually beneficial partnerships to capitalise on comparative advantages. Our teams are keen to collect, exchange and share the golden grains of practical experience from each new project implementation that would bring additional value.

Apart from business value, Spitch pays close attention to real customer experience, and the best example of such attention is Spitch’s unique experience in accurate recognition of dialects. The Head of Technology at Google, Urs Hölzle, stated in a recent interview with the Swiss newspaper Tagesanzeiger that speech recognition for Swiss German is not possible. This is of course true for Google and maybe other IT giants; however, we at Spitch offer highly accurate speech recognition for German and Swiss German dialects, among others, and Spitch solutions are in productive use at multiple Swiss companies, as noted by my colleague Jürg Schleier, Country Manager DACH at Spitch AG, in his recent LinkedIn post. The same know-how and approaches are replicated by Spitch for other languages.

We at Spitch are always happy to share lessons learned and explore new partnerships. Please do not hesitate to follow our corporate LinkedIn account, where we post our own news and repost recent domain-related news, including items commented on by our trusted experts. Acting together and openly sharing the lessons learned when exploring new routes is definitely the right way to succeed.

Or just go to our website and start hiring your virtual customer care team members there: a contact centre agent, supervisor or security officer, speaking Swiss or High German, English, French or Italian.

