First, thank you to Rüdiger for making his Performance and Accuracy Measurement tools available to me.
These preliminary measurements were made on my i7 3960X machine, which has a single-thread PassMark rating of ~1984. I used the exact same sample recordings that I have used to evaluate versions 11 and 12.

Upgraded Profile Performance Testing
My first test was to validate Olive's comments on upgraded profiles (which the measurements confirmed).
After installing Dragon NaturallySpeaking v13, I upgraded my existing v12.5 profile, which is two years old, has been optimized multiple times and used with six different microphones. I then tested the performance of this upgraded profile using Rüdiger's performance tool and my standard performance recordings (first the Rainbow Passage).
With the slider at 100%, the upgraded profile ran faster in version 13 than it did in version 12 with the slider at 50%. The Real Time Factor improved by just under 20%. In version 13 there was no measurable difference in the Real Time Factor between the slider at 50% and the slider at 100%. The individual accuracy scores improved.
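For readers unfamiliar with the term, here is a minimal sketch of how a Real Time Factor and its relative improvement can be computed. It assumes the common definition (processing time divided by audio duration); I do not know how Rüdiger's tool computes it internally, and the timings below are hypothetical placeholders, not my measurements.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Common definition (an assumption here): recognition time / audio duration.

    A value below 1.0 means recognition ran faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def relative_improvement(old_rtf: float, new_rtf: float) -> float:
    """Fraction by which the RTF dropped going from old_rtf to new_rtf."""
    if old_rtf <= 0:
        raise ValueError("old RTF must be positive")
    return (old_rtf - new_rtf) / old_rtf


# Hypothetical timings for one 120-second recording (not measured values).
audio = 120.0
rtf_v12 = real_time_factor(60.0, audio)
rtf_v13 = real_time_factor(48.5, audio)
improvement = relative_improvement(rtf_v12, rtf_v13)

print(f"v12 RTF: {rtf_v12:.3f}, v13 RTF: {rtf_v13:.3f}, "
      f"improvement: {improvement:.1%}")
```

A lower RTF is better, so an "improvement of just under 20%" means the recognition time for the same recording dropped by that fraction.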
The most interesting result was that a brand-new, untrained, v13 profile set to BestMatch V, Large Vocabulary was not any faster than my v12 profile upgraded to v13 (within the likely margin of measurement error).
A quick test dictation of my technobabble custom words demonstrated that the version 12 to version 13 upgraded profile recognized my custom words at least as well as in version 12. Of course, custom technical dictation into the brand-new untrained profile was completely unacceptable (quantitative testing on custom technical vocabulary will come later).

Accuracy Testing
I have a 3,800-word recording of the Dogbert Management Handbook that was made in one straight-through reading of about 20 minutes. I had this recording transcribed by a human and then reviewed and corrected by two other people, as well as myself, to ensure that the reference text is completely accurate relative to what was actually spoken. I have never trained either profile using this text. I used Rüdiger's accuracy tool to strictly measure the accuracy of version 13 versus version 12.
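To illustrate what measuring accuracy against a reference transcript involves, here is a rough sketch using word-level edit distance (substitutions, insertions and deletions). This is a generic technique and not necessarily what Rüdiger's accuracy tool does; the function names and the sample sentences are made up for illustration.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    # Classic dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def accuracy(reference: str, hypothesis: str) -> float:
    """1 - errors / reference words (can go negative with many insertions)."""
    errors, n_words = word_errors(reference, hypothesis)
    if n_words == 0:
        raise ValueError("reference text is empty")
    return 1.0 - errors / n_words


# Made-up example: two substitutions in a nine-word reference.
reference = "The quick brown fox jumps over the lazy dog."
recognized = "The quick brown fox jumped over a lazy dog."
errors, n_words = word_errors(reference, recognized)
print(errors, n_words, f"{accuracy(reference, recognized):.1%}")
```

Counting errors this way is what makes a statement like "misrecognitions were reduced by X%" comparable across versions, provided the same recording and the same reference text are used each time.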
In version 12, the recognition accuracy did not change between using a brand-new untrained profile and my trained/optimized technical profile (although the specific misrecognitions did vary). This has been my experience with colloquial dictation and my voice.
Using DNS 13 with an untrained profile, the number of misrecognitions was reduced by approximately 15%. The trained v12 profile that was upgraded to version 13 showed no improvement in accuracy; its accuracy was exactly the same as when it was used in version 12.
This was actually my expected result with the implication that I should re-create my technical user profile in version 13 for best accuracy. I will be doing more measurements over the next couple of days, including Ultrabooks. If anyone has suggestions for something that they would like measured quantitatively just reply to this thread.
Kudos to Olive and the rest of the Nuance team on the upgrade and my apologies for posting in English as my written German is worse than horrible.
This is great work, and very interesting results, Phil. Thanks so much for posting it here also.
Rüdiger
_______________________________________
Dragon Professional 16 on Windows 10 Pro and Windows 11; SpeechMike Premium (LFH3500); Office 2019 Pro + Office 365 (monthly subscription); HP ZBook Fury 17 G8 - i7-11800H - 24 MB SmartCache - 32 GB RAM - 1 TB SSD
If Phil's measurements are correct - which I have absolutely no reason to doubt - AND Olive's statements about what the profile upgrade process essentially does are correct - which I have absolutely no reason to doubt either - one could draw the somewhat provocative conclusion that optimizing one's profile during normal use (outside the upgrade process) using the DRA files is not such a good idea. It follows from Phil's measurements that an upgraded profile does not offer improved accuracy, whereas a freshly created profile does. Since it is hard to imagine that this drawback of an upgraded profile is caused by copying settings, custom words and property changes in the vocabulary, the most likely culprit should be the optimization process, which has of course not occurred in a freshly created, untrained profile.
However, while I have strictly followed the experts' advice to create a new profile for every subsequent version, my personal and anecdotal experience indicates that regular profile optimization is beneficial. Go figure.
Anyway: I am very grateful to both of them for sharing their respective expertise.
Meinhard, I have a different working hypothesis that will need additional testing. I probably wasn’t clear enough about the accuracy behavior of this exact same recording in version 12.
In version 12, with my Great Lakes accent, there was no difference in accuracy between:
- a brand-new untrained profile
- an optimized profile with trained custom words and writing style analyzed
- the two-year-old, multiply optimized profile
That is, with my accent and a very good dictation style, the recognition of colloquial English in version 12 was unaffected by optimization.
A version 13 untrained profile reduced the misrecognitions compared with either an untrained or an optimized profile running in version 12. In addition, the optimized profile migrated from v12 to v13 did not show any reduction in misrecognitions.
There are two hypotheses, and I think I can distinguish between them with additional measurements:
- A migrated profile does not take full advantage of the improvements in recognition of the speech engine.
- With my residual Wisconsin accent, a profile that is optimized on technical writing will reduce the accuracy for colloquial speech.
My primary use is running my Systems Integration business. As per my previous posts here on the forum, for me, optimization shows measurable incremental improvements in recognition of my technical prose (using Rüdiger's tools). On playback, my colloquial speech has a noticeable Wisconsin accent, whereas my pronunciation of technical terms has almost no trace of it.
Dear NUANCERs, this forum is brilliant, and the administrator really does great work. Unfortunately my English is not so good: do I understand correctly that a new profile in DNS 13 is better than importing a DNS 12 profile? Regards, new user Mathis Oberhof
DNS Pro 13 (13.00.000.086) on Win 8.1 (64-bit); Plantronics Audio 400 DSP; Acer Aspire V3-772G, Intel I7-4702Q, 2.2 GHz up to 3.2 GHz, 8 GB
Mathis, I use the "Google Translate" feature in the Chrome web browser to translate the text on this forum from German into English.
My short answer is that I recommend creating a new DNS 13 profile rather than using the profile upgraded from DNS 12.
I used "Google Translate" to translate my original English text into German.
I hope the results are not too silly ("albern") or incomprehensible ("unverständlich").
Thanks, Phil, for taking the trouble to engage "Google Translate" when there are so many human translators around.
Indeed, the results are only slightly "albern" (silly), but really not "unverständlich" (incomprehensible) at all.
Best, Rüdiger
PS: once the results are somewhat more stable, we will certainly provide a German summary, of the human-made kind.
PPS: after going back and reading up on your most recent update of the thread on KB, I can't help but state that your efforts are invaluable, and I say this on behalf of the entire community. Nuance should actually pay you for this.
After creating a new profile and increasing the recognition accuracy accordingly, I now get a clearly better result. Many thanks for the tip.
Quote from phils (Phil Schaadt): "My short answer is that I recommend creating a new DNS 13 profile rather than using the profile upgraded from DNS 12."
Thank you very much. Google Translate often "lies", but this message I could understand. Sincerely, Mathis, Berlin.
... who or what, please? - None of them are here; we have nothing to do with them, and we are here as unpaid volunteers!
Ich denke, was ich gefunden habe heute Morgen wird für die deutschen Mitglieder des Forums mit vielen benutzerdefinierten Wörter diktieren ungewöhnlichen Gegenstand sein.
[Edit: in correct German, that would read as follows; sorry for this:]
Ich denke, was ich heute Morgen herausgefunden habe, wird für Anwender, die viele benutzerdefinierte Wörter verwenden, von Belang sein. (In English: I think that what I found out this morning will be of interest to users who use many custom words.)
I think that machine translation would completely destroy the clarity of what follows.
I went back and collected all of my 2013 analysis documents and ran those through Dragon as well. That means the technical documentation I have fed through Dragon for analysis now totals ~1000 pages and almost 400,000 words.
I have not yet trained all of my custom words in the Vocabulary Editor. By doubling the amount of analysis documents covering my systems integration business, I improved my recognition of unusual or neologism IT words (technobabble) by about half a percent over the recognition rate I had from analyzing just 500 pages of documents.
An important consideration is that almost all of my custom words have a "Written Form"\"Spoken Form" already in the Vocabulary Editor which I entered as an XML list prior to document analysis.
For those specific custom words which are short abbreviations with strange pronunciations (wazz=WAS, whizzer=WSRR, zackemel=XACML, etc.), approximately half are now recognized after the second analysis. But almost all of them were recognized in my old profile, where I trained all of my custom words individually in the version 12 Vocabulary Editor. For all of these abbreviations, the strange pronunciations have a "Written Form"\"Spoken Form" in the version 13 Vocabulary Editor.
I'm still not back at 99% recognition for complex technical prose when using my best dictation style, as I was in version 12, but I'm quite close.
The critical observation is that after I trained each one of those abbreviations with strange pronunciations, individually in the vocabulary editor, those abbreviations were immediately recognized when I dictated them either in context, as they would appear in the analysis text, or as individual standalone words.
Therefore, it seems that training the custom words individually and optimization after dictating technical documents will also be required for me to get my custom technobabble recognition back up to 99%.
I would expect this kind of result based on Rüdiger and Mark's work where they tested using Pro to recognize complex medical prose.
Therefore, my initial conclusion is that for colloquial American English speech, DNS 13 is ready to go out of the box with five minutes of training to adjust for any slight accent. I can't really test what would be necessary for a strong accent from a non-native speaker.
Dictating specialized prose for which there is no specialized language model (unlike the Nuance-supplied models for legal or medical) will still require a fair amount of effort (at least eight hours) to achieve a dictation rate equivalent to that of dictated colloquial speech.
Today and tomorrow I will be completing a total of approximately 10 hours of technical dictation, so I will be able to test any improvements from optimizing my profile separately from training my technical words. I am exporting my profile incrementally so that I can go back to each of the steps along the way to do any additional testing.