I just finished measuring the accuracy of my time series of DNS 13 user profile backups going back to the release of Dragon NaturallySpeaking version 13. I have a standard recording of colloquial dictation of approximately 4000 words, or just under 20 minutes. I made this recording several years ago with a theBoom O microphone, and I have used it ever since as my standardized test for Dragon accuracy.
I can find no evidence of DNS 13 profile degradation over time.
The most current profile that I measured has been Accuracy Tuning optimized 12 times with three different primary microphones and three additional intermittently used microphones. None of these microphones is the theBoom O.
All of these individual profiles are derived from a single original BestMatch V Large Vocabulary profile, in which I had individually trained each of my 2,500 custom words and analyzed approximately 1 million words of my technical writing, emails, and forum posts to adapt to my writing style.
Every month I also incrementally analyze my sent emails and forum posts, plus any new technical documents from my current project workload, and I individually train each new technobabble custom word. This is necessary to get any kind of reasonable accuracy with my technobabble dictation, since the technical and product vocabulary, particularly in cloud computing, changes very rapidly.
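For anyone curious how new technobabble candidates might be spotted before training, here is a minimal sketch, not my actual workflow: it assumes the recent documents are plain-text files and that the already-trained vocabulary has been exported to a word list, and it simply flags frequent words that aren't on that list. The file names are made up; the actual training still happens through Dragon's own vocabulary tools.

```python
# Hypothetical sketch: flag candidate new technical terms in recent documents
# by comparing them against a list of words Dragon already knows.
# "known_words.txt" and the docs folder are assumptions for illustration.
import re
from collections import Counter
from pathlib import Path

def candidate_terms(doc_dir: str, known_words_file: str, min_count: int = 2):
    """Return unknown terms seen at least `min_count` times, most frequent first."""
    known = {w.strip().lower()
             for w in Path(known_words_file).read_text(encoding="utf-8").splitlines()
             if w.strip()}
    counts = Counter()
    for doc in Path(doc_dir).glob("*.txt"):
        for token in re.findall(r"[A-Za-z][A-Za-z0-9'-]+", doc.read_text(encoding="utf-8")):
            if token.lower() not in known:
                counts[token] += 1
    return [(term, n) for term, n in counts.most_common() if n >= min_count]

if __name__ == "__main__":
    for term, n in candidate_terms("docs/hybrid_cloud", "known_words.txt"):
        print(f"{n:4d}  {term}")
```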
After each Accuracy Tuning, I save and export these profiles to my network server with annotations about the profile, which makes comparison measurements easy.
To emphasize, this measurement is against colloquial, not technical, dictation. Using Rüdiger's accuracy measuring tool, the variation in accuracy across all of these profiles is only 0.6%! The average accuracy is 98.7%.
Note: all the machines I use have SSDs, the profiles are exported and stored on my network server, I never let my computers sleep or hibernate, I make sure the USB ports are powered at all times, and I shut down (power off) each computer after saving my user profile.
The text of the test recording has NEVER been used in any of these profiles. To run these tests, I pull the profiles off my network backup server. Once I complete a measurement, I delete that profile from the Dragon instance on that workstation so there is no textual contamination.
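For reference, word-level accuracy against a known transcript is conventionally computed by aligning the recognized text with the reference and counting substitutions, insertions, and deletions (word error rate). Rüdiger's tool may well do something more sophisticated; the Python sketch below only illustrates the basic metric.

```python
# A minimal sketch of word-level accuracy against a reference transcript,
# using standard edit-distance alignment. Assumes a non-empty reference.
def word_accuracy(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # dp[i][j] = minimum edits (substitutions + insertions + deletions)
    # to turn the first i reference words into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution / match
    return 1.0 - dp[len(ref)][len(hyp)] / len(ref)    # accuracy = 1 - WER

# e.g. word_accuracy(reference_text, dragon_output) -> 0.987 for 98.7%
```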
Thanks so much, Phil, for doing all the hard work of conducting the kind of testing that meets all the criteria for producing such valuable statistical evidence, and for sharing the results with us, especially at this time of day where you are, in the San Francisco Bay Area.
Being somewhat familiar with this type of analysis, I am fully aware of the enormous amount of perseverance it usually takes, not to start something like this, but to keep it up until it is really finished. Kudos to you.
However, while you mention no degradation of accuracy over time, which is one of the things most users fear or seem to encounter anecdotally, what about any increase in accuracy gained from optimizing? Would you say that optimizing is a must, or rather an option?
As far as I'm concerned, I don't optimize regularly, to tell the truth; if I do, it is probably just out of habit, and I couldn't really say whether the overall impact is positive or negative, but that's just because I don't monitor things as meticulously as you do.
It has always been a great pleasure working with you.
Rüdiger
_______________________________________
Dragon Professional 16 on Windows 10 Pro and Windows 11; SpeechMike Premium (LFH3500); Office 2019 Pro + Office 365 (monthly subscription); HP ZBook Fury 17 G8 - i7-11800H - 24 MB SmartCache - 32 GB RAM - 1 TB SSD
Thank you for the kind words, Rüdiger; this analysis would be impossible without your tool.
With respect to optimization, the only increase in accuracy for colloquial speech came after the first optimization. That first optimization was run after the 1-million-word document analysis and the pronunciation training of all of my custom words, followed by a week of usage. Further optimizations make only small, random changes in the accuracy of colloquial speech.
Optimization is critical to me for keeping my technical recognition rates in the 97%-plus range. I believe there are two reasons for this. First, with respect to custom words and writing style, the Dragon NaturallySpeaking program has a terrible "memory" for unused customizations, metaphorically speaking.
As an example, when Dragon NaturallySpeaking 13 first came out, I was doing a lot of work around the new API Management Gateway technologies, and my recognition after the first optimization jumped several percentage points. My recognition rate stayed very high for several months because I was dictating about API Management every day, in both formal documents and emails.
In late December I shifted my focus to Hybrid Cloud. At first, the accuracy rates were pretty bad, so in order to get high rates of accuracy, I needed to optimize after analyzing new documents, adding and training new technical vocabulary, and dictating reports and designs about Hybrid Cloud. However, this first optimization did not give me as high a recognition rate as I had been getting with my API Management Gateway documents.
My second point is that it takes a large amount of document analysis text to get an optimized language model that is highly accurate for very unusual technobabble phraseology. The evolving terminology around Hybrid Cloud, Containers, Microservices, and Cloud Orchestration uses intentionally unusual naming and vocabulary to distinguish it from traditional IT technologies. It took two months and two more optimizations before my recognition rate for Hybrid Cloud documents stopped improving. Because of the rapidly evolving terminology, the recognition rate for Hybrid Cloud is about 1% below my other technical terminology. That's why I analyze my documents and emails monthly.
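As a rough way of judging whether enough topic text has been gathered, one could check how many of a topic's signature phrases actually appear in the collected analysis corpus. The sketch below is purely illustrative; the folder name and phrase list are made up, and Dragon itself exposes no such check.

```python
# Hypothetical sketch: estimate how well an analysis corpus covers the unusual
# multi-word phrasing of a new topic by checking which domain phrases occur
# in the collected text. Paths and phrases are assumptions for illustration.
from pathlib import Path

def phrase_coverage(corpus_dir: str, phrases: list[str]) -> float:
    if not phrases:
        return 0.0
    text = " ".join(p.read_text(encoding="utf-8").lower()
                    for p in Path(corpus_dir).glob("*.txt"))
    hits = [ph for ph in phrases if ph.lower() in text]
    missing = set(phrases) - set(hits)
    if missing:
        print("Not yet covered:", ", ".join(sorted(missing)))
    return len(hits) / len(phrases)

# e.g. phrase_coverage("docs/hybrid_cloud",
#                      ["cloud orchestration", "container registry", "service mesh"])
```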
Now back to my first point: I recently needed to do another API Management Gateway design for a new client. After six months of not using this vocabulary, my recognition rates had dropped off significantly. I reanalyzed all of my old API Management documents, and while that helped, I still got an additional 1.5% accuracy improvement after an optimization that followed the dictation of a rather lengthy API Management Gateway document.
My conclusion is that for those who either dictate only colloquial speech or do not have a rapidly evolving custom vocabulary, optimization is not necessary. Document analysis and adapting to writing style have the biggest effect. I noticed the strong document analysis effect with my email replies. Since many of my email replies (a few hundred per day) are relatively short and stylized, with my monthly incremental email analysis my recognition rate is 100% for almost all of my short emails.
Phil, thanks for mentioning my tools, but tools are only as good as the way they are used, and I can't help feeling that you're about the only person who has understood the way they were meant to be used.
Well, thank you so much for the update and for expanding on all this information, which is extremely useful and just too important to let slip through. Whenever we get close to wrapping this up, we should definitely provide a German summary of the discussion, for good measure, and I wouldn't mind someone volunteering to help us with it.
I would just like to comment on a few points you have made, which are all pretty much in line with my day-to-day dictating experience.
First, it looks as if Dragon's memory for unused customizations isn't only terrible but conspicuously short-lived: ephemeral at best, nonexistent if in doubt.
Second, no matter how much customization the user applies, there is no way of overriding the built-in, factory-provided models we are left to use in the long run.
If I were to hazard a guess, I would imagine it has been implemented this way on purpose: the models are kept not too flexible, but robust, so they cannot be rendered unusable too easily.
Rüdiger
_______________________________________
Dragon Professional 16 on Windows 10 Pro and Windows 11; SpeechMike Premium (LFH3500); Office 2019 Pro + Office 365 (monthly subscription); HP ZBook Fury 17 G8 - i7-11800H - 24 MB SmartCache - 32 GB RAM - 1 TB SSD
Rüdiger, I think you're right that Nuance has put in functionality to keep the underlying language model from straying too far from the core language model. For example, when you did the tests with Dragon Medical, it was clear that Dragon Medical provided a higher recognition rate no matter how much Dragon Professional was trained on medical terminology.
In my case, when I return to a previous IT product area, I can't practically go back to an earlier user profile, because all of the other associated product names and technologies wrapped around my design continually change, as do the client team members. To get around the Dragon forgetfulness, I categorize my analysis documents by general topic so that I can reanalyze those documents on top of my most current user profile and vocabulary. Even then, I still need to hand-correct product version numbers to get the strange IT product capitalizations right. (I dictate the names as a single phrase including the version number, which gives me better recognition.)
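For what it's worth, the topic-folder scheme can be as simple as the following sketch. The network path and folder layout here are made up for illustration, and the actual reanalysis is still done through Dragon's own document analysis feature.

```python
# Hypothetical sketch of the topic-folder scheme: documents are filed by
# general topic so an entire topic's corpus can be fed back into document
# analysis when returning to that product area. Paths are assumptions.
from pathlib import Path
from shutil import copy2

ARCHIVE = Path(r"\\server\dragon\analysis_docs")  # assumed network share

def file_document(doc_path: str, topic: str) -> Path:
    """Copy a finished document into its topic folder in the archive."""
    dest_dir = ARCHIVE / topic
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(copy2(doc_path, dest_dir))

def topic_corpus(topic: str) -> list[Path]:
    """List every archived document for a topic, ready to re-run through
    Dragon's document analysis on top of the current user profile."""
    return sorted((ARCHIVE / topic).glob("*.*"))

# e.g. file_document("API_gateway_design_v3.docx", "api_management")
#      topic_corpus("api_management")
```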
However, for someone whose usage pattern results in custom profiles that are stable over time, it might be possible to achieve high recognition rates without reanalyzing, by keeping multiple custom profiles, one per topic or project. I am thinking, for example, of the person who works with historical German vocabulary as opposed to contemporary German vocabulary.
If someone wants to translate my two posts into German, I'll be happy to answer any questions and provide clarification for the translation.
Yes, quite so. To use an analogy, you might think of a user profile as a wooden rod: you may bend the ends to some extent, and keep them bent as long as you hold on, but it will always return to its original shape as soon as you let go.
To me this makes perfect sense, because if the user area (for lack of a better term) were given too much space, the core functionality might soon be compromised, rendering the profile pretty much useless.
The user does have the chance to adapt the profile to his specific needs, but a profile, once adapted appropriately, had better be left as is.
Once again, thank you very much for providing us with all this information and for covering some very important key points, not only in theory but, most importantly, in terms of their practical implications.
_______________________________________
Dragon Professional 16 on Windows 10 Pro and Windows 11; SpeechMike Premium (LFH3500); Office 2019 Pro + Office 365 (monthly subscription); HP ZBook Fury 17 G8 - i7-11800H - 24 MB SmartCache - 32 GB RAM - 1 TB SSD