Useful Tips

Smartphone voice control

There are many solutions for voice control of smartphone functions, but not all of them are implemented well. We have selected those that really work.

Voice control on modern smartphones and communicators with sufficiently powerful CPUs is an established trend in building convenient user interfaces. To varying degrees, it is available on all major mobile platforms. In iOS it appeared in version 3.0 (fully functional from 4.0), and in Google Android in version 1.6 (fully functional from 2.2). The feature is also relatively well implemented in Windows Mobile and S60. We have selected several solutions that can replace the standard voice control modules, as well as software that extends their functionality.

Main characteristics of speech recognition systems

For the past two or three years, voice control has been considered one of the most promising technologies for building user interfaces. Microsoft executives say so, and representatives of Google and Apple are showing noticeable interest.

Indeed, controlling a mobile device by pressing buttons already seems archaic. Touchscreens and voice are marketed as natural ways for people to interact with smart devices. An important characteristic of such systems is how accurately they recognize commands. While touch control is more or less a solved problem (modern smartphones even support complex multitouch gestures), voice commands are not so simple.

First, the system does not always respond correctly to the way commands are pronounced. You have to adapt to it, which is not always convenient: constantly monitoring the timbre and intonation of your voice is tiring. In addition, commands must be separated from background noise, which requires computational resources.
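
As a rough illustration of what separating commands from background noise involves, here is a minimal sketch of an energy-based detector. It is not how any of the products discussed here work; the function names, the frame size and the threshold factor are all assumptions chosen for the example, and it assumes the first few frames contain only noise.

```python
import math


def frame_energies(samples, frame_size):
    """Return the root-mean-square energy of each full frame."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return energies


def speech_frames(samples, frame_size=160, noise_frames=5, factor=3.0):
    """Flag frames whose energy exceeds `factor` times the noise floor.

    Assumption: the first `noise_frames` frames hold background noise only,
    so their average energy serves as the noise floor estimate.
    """
    energies = frame_energies(samples, frame_size)
    if not energies:
        return []
    head = energies[:noise_frames]
    noise_floor = sum(head) / len(head)
    threshold = noise_floor * factor
    return [energy > threshold for energy in energies]
```

Even this simple check has to process every audio frame, which hints at why continuous listening loads the processor; real recognizers do considerably more work per frame.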

Second, such a system does not switch on automatically: to activate it you usually need to press a button on the device or on an accessory (for example, a wireless headset). Activating it from software is not always convenient. On Windows Mobile communicators with the Broadcom software stack, activating Microsoft Voice Command from a Bluetooth headset may be unstable or may not work at all.

Third, voice control cannot yet correct the user's inaccuracies and mistakes. For example, if you try to play a song by a band whose name contains the article "the" without saying the article, in most cases the device will not understand the command. Difficulties also arise when dialing contacts that share the same name in the address book: for this to work correctly, you need to fill in the "nickname" field and assign an additional launch command.
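
To make the two problems concrete, here is a minimal sketch of how a lookup could tolerate a dropped article and use nicknames to tell apart contacts with the same name. It is an illustration only, not the logic of any product mentioned here; the function names and the `(entry_id, name, nickname)` tuple layout are assumptions.

```python
import re

# Assumption: only English leading articles are stripped.
LEADING_ARTICLES = ("the ", "a ", "an ")


def normalize(name):
    """Lower-case, strip punctuation, and drop one leading article."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    cleaned = " ".join(cleaned.split())
    for article in LEADING_ARTICLES:
        if cleaned.startswith(article):
            return cleaned[len(article):]
    return cleaned


def build_index(entries):
    """Map normalized names and nicknames to entry ids.

    `entries` is a list of (entry_id, name, nickname) tuples;
    nickname may be None.
    """
    index = {}
    for entry_id, name, nickname in entries:
        index.setdefault(normalize(name), []).append(entry_id)
        if nickname:
            index.setdefault(normalize(nickname), []).append(entry_id)
    return index


def lookup(index, spoken):
    """Return the ids of all entries matching the spoken phrase."""
    return index.get(normalize(spoken), [])
```

With such an index, "Beatles" and "The Beatles" resolve to the same entry, while a shared name returns several candidates and only the nickname picks out one of them.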

Fourth, with constant use of voice input (for example, when dictating an SMS), the device's processor runs the fairly resource-intensive modules of the recognition system, which hurts the communicator's performance and battery life. However, this problem is gradually being solved.

Vlingo is a cross-platform voice control module for working with third-party software

Speereo Voice Launcher offers a rich set of voice control functions and copes even with unclear pronunciation

Control of standard system functions and voice search

All popular mobile operating systems implement, to one degree or another, voice recognition of commands for typical tasks: for example, dialing a number from the address book, opening the email client, or starting a playlist. In addition, these modules can announce system events, reporting that the battery is running low or that the phone has switched to silent mode. None of the programs can execute more complex commands (for example, “open the email client, write a letter to Mr. Ivanov and mark all messages in the Inbox as read after sending it”). However, they are gradually improving. For instance, if you ask an iPhone running iOS 4 what time it is, it will announce the system time. The same voice module also understands negative responses from the user: "no", "wrong", and so on. On other mobile systems, you have to fall back on touch control for this.

On classic Windows Mobile devices, two packages are used for voice control: Cyberon Voice Commander and Microsoft Voice Command. However, they cannot be used at the same time; you have to choose one.

The first requires some training to recognize commands, although the command list is not very long. The program can call contacts, open Calendar entries, launch all standard and some third-party applications, play music, and read incoming messages aloud. The second package additionally controls the volume and the state of wireless connections, and announces system events. Microsoft also now has an interesting product for advanced voice control, TellMe. It can launch the Bing search client with a dictated query and report stock prices, sports results, weather, movie listings and traffic conditions. For all this, however, the device must be connected to the Internet and have GPS satellites in view, because both are used to determine the location. In addition, the service is not available in Russian.

In iOS and in Android 2.2 (FroYo) and later, the built-in voice dialing systems are roughly equivalent, except that Google's product can also plot routes on a map to a given company's office or to a specific point. In Symbian OS 5th Edition, voice control handles only the standard system functions; for voice search you need to install separate software, for example Google Mobile App.

Voice control of additional functions and launching third-party programs

Of course, voice tools should not merely make everyday work with the communicator a little easier; they should take over routine activities completely, and not only in standard programs but also in those installed by the user. For this you can use separate products, for example Speereo Voice Launcher. The program is compatible with Symbian OS (including S60) and Windows Mobile, and Android support is planned. It is a compact shell that lets you assign voice commands for launching any application or file and for opening any web page in the browser.

The product does not depend much on the characteristics of the owner's voice: the recognition engine can detect commands spoken with an accent or with minor diction defects. It integrates with the standard programs (address book, organizer, messaging client), but bookmarks from Favorites are not imported. Launch commands are defined in the application settings: the user types the command name in Russian transliterated into Latin characters or in one of the supported languages (English, German, French, etc.), after which it is added to the database. Interestingly, Speereo picks up commands even in noisy environments.
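
The idea of a user-defined command database can be sketched as follows. This is a hypothetical illustration, not Speereo's implementation: the class name, the target strings and the cutoff value are assumptions, and approximate text matching with Python's standard `difflib` module stands in for the acoustic tolerance of a real recognition engine.

```python
import difflib


class CommandRegistry:
    """Hypothetical store mapping user-typed command names to launch targets."""

    def __init__(self, cutoff=0.75):
        self._targets = {}
        self._cutoff = cutoff  # minimum similarity for an approximate match

    def register(self, phrase, target):
        """Add a command name (for example, transliterated Russian)."""
        self._targets[phrase.strip().lower()] = target

    def resolve(self, recognized):
        """Return the target of the closest registered command, or None."""
        key = recognized.strip().lower()
        matches = difflib.get_close_matches(
            key, list(self._targets), n=1, cutoff=self._cutoff
        )
        return self._targets[matches[0]] if matches else None


registry = CommandRegistry()
registry.register("pochta", "app:mail")          # hypothetical target names
registry.register("open browser", "app:browser")
```

Here a slightly distorted result such as "pochto" still resolves to the mail application, while an unrelated phrase resolves to nothing and the shell could ask the user to repeat the command.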

For Google Android versions below 2.2, there are three applications that can stand in for the Voice Actions service introduced in Android FroYo. The first two are Edwin and Vlingo, which work only with English.

Edwin is an advanced voice command client that supports not only Google searches but also looking up mathematical formulas in Wolfram Alpha, posting messages to Twitter, and so on.

Vlingo (which also runs on iOS, Windows Mobile, S60 and RIM BlackBerry) has the same features as Microsoft's TellMe, plus the ability to post statuses to social networks and to search for routes and for contact information about nearby companies. Finally, there is TopVoiceControl for Android communicators. In addition to the usual dialing of numbers from the address book and recognition of spoken digits, it can control wireless interfaces and open the calendar.

To-do list

Voice organizers are still exotic, but the first applications of this kind are already appearing and gaining some popularity. For example, the aforementioned developer Speereo Software offers Speereo Voice Organizer, designed for creating Calendar and Tasks entries and emails. In this case, however, the voice is not converted to text: a message is sent as an attached audio file, and the program also alerts you about current tasks. For iOS there is the QuickVoice2Text Email client, which recognizes dictated messages and converts them to text.

For Google Android, there is Taskos To Do List, a voice application for adding tasks to a to-do list, and VoiceLink, a program for sending SMS, emails and Twitter messages.

Taskos To Do List: make a to-do list by dictating items to your Android device

Historical background

The first speech recognition technologies appeared in 1952 and could automatically recognize spoken digits. By the early 1990s, solutions had appeared on the market that could handle single words and phrases, as well as simple sentences. They were common in the United States and were used in medicine and the military. Voice control systems began to spread among ordinary consumers only at the turn of the 21st century, with the advent of smartphones.
