Alibek Jakupov

Overview of modern approaches applied in conversational modeling (Part 2)

Updated: Nov 19, 2021

In the previous section we saw that the approaches used in modern digital assistants are efficient for simple tasks such as setting an alarm or making a restaurant reservation.

However, when the task goes beyond plain conversation modeling, the results produced by the technologies described above may not match the user's expectations. For example, the most common approach, based on entities and intents, requires manual training of a domain-specific intent and slot model. As an illustration, Microsoft's PDAs use a separate binary one-against-all classifier for each domain. If the user's request falls outside the range of pre-trained domains, the system's performance is likely to degrade [12].

As mentioned above, a ranking-based approach may demonstrate common sense knowledge, but such results require a sufficiently large neural network [14]. Moreover, ranking may be efficient even without 'understanding' the dialog, as it relies on semantic transformations and a rule-based approach [1], [13]. Even Google's seq2seq, which seems to be a perfect tool for conversational modeling, has serious difficulties. In the first place, seq2seq is unable to update its long-term memory. Thus important details such as the user's name are forgotten after several iterations, and consequently no new facts can be memorized.
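The one-against-all domain setup mentioned above, and the way it degrades on out-of-domain requests, can be illustrated with a minimal sketch. This is not the architecture described in [12]: the keyword sets, the `domain_score` and `route` functions, and the `threshold` value are all hypothetical stand-ins for trained per-domain binary classifiers.

```python
from typing import Dict, Optional, Set

# Hypothetical keyword sets standing in for trained per-domain binary classifiers.
DOMAIN_KEYWORDS: Dict[str, Set[str]] = {
    "alarm": {"alarm", "wake", "remind", "timer"},
    "restaurant": {"restaurant", "table", "reservation", "book", "dinner"},
}


def domain_score(utterance: str, keywords: Set[str]) -> float:
    """Toy 'this domain vs. everything else' score: share of tokens that are domain keywords."""
    tokens = utterance.lower().split()
    if not tokens:
        return 0.0
    return sum(token in keywords for token in tokens) / len(tokens)


def route(utterance: str, threshold: float = 0.2) -> Optional[str]:
    """Score the utterance against every domain independently.

    Returns the best-scoring domain if it clears the (assumed) threshold,
    or None when every per-domain scorer rejects the request.
    """
    scores = {name: domain_score(utterance, kw) for name, kw in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=lambda name: scores[name])
    return best if scores[best] >= threshold else None


if __name__ == "__main__":
    print(route("wake me up with an alarm"))
    print(route("book a table for dinner"))
    print(route("summarize this patient record"))
```

The last request matches none of the pre-trained domains, so the router has nothing to hand it to. Supporting it would mean training yet another domain-specific model, which is the limitation discussed above.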

Equally important, seq2seq is not capable of obtaining data from external sources. In addition, the knowledge obtained during the learning phase cannot be applied to other domains, and re-training the model for a new domain requires a lot of time and data. Furthermore, the model tends to give short responses such as "yes" or "no" [13]. In scenarios where the user is inundated with too much data and has too little time to comprehend it all, the existing solutions are unlikely to be very efficient.

An example of such a scenario may be summarization systems for primary care physicians [9] where accuracy and speed are the key concerns. It is also important to realize that all the conversational agents described above are reactive ones, making it impossible to implement prescriptive analytics.

To sum up, it is of particular interest to develop an agent that:

  1. demonstrates domain specific knowledge;

  2. demonstrates common sense knowledge;

  3. does not require the help of a human agent;

  4. does not require re-training a large network;

  5. suggests decision options.


[1] D. Ameixa, L. Coheur, P. Fialho, and P. Quaresma. Luke, I am your father: Dealing with out-of-domain requests by using movies subtitles. In T. Bickmore, S. Marsella, and C. Sidner, editors, Intelligent Virtual Agents, pages 13–21, Cham, 2014. Springer International Publishing.

[2] A. S. Ashoor. Anomaly detection algorithm using multiagents. International Journal of Scientific and Technology Research, 1, 2012.

[3] C. Chemudugunta, P. Smyth, and M. Steyvers. Text modeling using unsupervised topic models and concept hierarchies. CoRR, abs/0808.0973, 2008.

[4] S. Chaudhuri, U. Dayal, and V. Narasayya. An overview of business intelligence technology. Commun. ACM, 54(8):88–98, Aug. 2011.

[5] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 2003.

[6] J. A. Hansen, E. K. Ringger, and K. D. Seppi. Probabilistic explicit topic modeling using Wikipedia. In I. Gurevych, C. Biemann, and T. Zesch, editors, Language Processing and Knowledge in the Web, pages 69–82, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg.

[7] M. Henderson, R. Al-Rfou, B. Strope, Y. Sung, L. Lukacs, R. Guo, S. Kumar, B. Miklos, and R. Kurzweil. Efficient natural language response suggestion for smart reply. CoRR, abs/1705.00652, 2017.

[8] J. H. Lau, K. Grieser, D. Newman, and T. Baldwin. Automatic labeling of topic models. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 1536–1545. Association for Computational Linguistics, 2011.

[9] R. S. Margalit, D. Roter, M. A. Dunevant, S. Larson, and S. Reis. Electronic medical record use and physician-patient communication: an observational study of Israeli primary care encounters. Patient Education and Counseling, 1, 2006.

[10] R. Mihalcea and P. Tarau. TextRank: Bringing order into texts. In Proceedings of EMNLP-04 and the 2004 Conference on Empirical Methods in Natural Language Processing, July 2004.

[11] R. Sarikaya. The technology powering personal digital assistants. In INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, 2015.

[12] R. Sarikaya, P. A. Crook, A. Marin, M. Jeong, J.-P. Robichaud, A. Celikyilmaz, Y.-B. Kim, A. Rochette, O. Z. Khan, D. X. Liu, D. Boies, T. Anastasakos, Z. Feizollahi, N. Ramesh, H. Suzuki, R. Holenstein, E. Krawczyk, and V. Radostev. An overview of end-to-end language understanding and dialog management for personal digital assistants. IEEE, December 2016.

[13] D. Tarasov. В поисках разума: можно ли сделать “универсальный” чат-бот с помощью нейронных сетей? [In search of intelligence: can a "universal" chatbot be built with neural networks?], October 2017.

[14] D. S. Tarasov and E. D. Izotova. Common sense knowledge in large scale neural conversational models. International Conference on Neuroinformatics. Springer, Cham, 736, 2017.

[15] O. Vinyals and Q. Le. A neural conversational model. 2015.

[16] G. Zhu and C. A. Iglesias. In SumPre 2015 - 1st International Workshop on Summarizing and Presenting Entities and Ontologies Co-located with the 12th Extended Semantic Web Conference, Portoroz, Slovenia, June. ESWC.
