Publications
Under submission
Walker, N. T., Ultes, S., & Lison, P. (2023). A Graph-to-Text Approach to Knowledge-Grounded Response Generation in Human-Robot Interaction. arXiv preprint arXiv:2311.16137.
@article{walker2023graph, title={A Graph-to-Text Approach to Knowledge-Grounded Response Generation in Human-Robot Interaction}, author={Walker, Nicholas Thomas and Ultes, Stefan and Lison, Pierre}, journal={arXiv preprint arXiv:2311.16137}, year={2023} }
Knowledge graphs are often used to represent structured information in a flexible and efficient manner, but their use in situated dialogue remains under-explored. This paper presents a novel conversational model for human--robot interaction that rests upon a graph-based representation of the dialogue state. The knowledge graph representing the dialogue state is continuously updated with new observations from the robot sensors, including linguistic, situated and multimodal inputs, and is further enriched by other modules, in particular for spatial understanding. The neural conversational model employed to respond to user utterances relies on a simple but effective graph-to-text mechanism that traverses the dialogue state graph and converts the traversals into a natural language form. This conversion of the state graph into text is performed using a set of parameterized functions, and the values for those parameters are optimized based on a small set of Wizard-of-Oz interactions. After this conversion, the text representation of the dialogue state graph is included as part of the prompt of a large language model used to decode the agent response. The proposed approach is empirically evaluated through a user study with a humanoid robot that acts as conversation partner to evaluate the impact of the graph-to-text mechanism on the response generation. After moving a robot along a tour of an indoor environment, participants interacted with the robot using spoken dialogue and evaluated how well the robot was able to answer questions about what the robot observed during the tour. User scores show a statistically significant improvement in the perceived factuality of the robot responses when the graph-to-text approach is employed, compared to a baseline using inputs structured as semantic triples.
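As an illustration of the graph-to-text idea, here is a minimal sketch (not the authors' code; the graph encoding, templates and attribute names are invented) of how edges of a dialogue-state graph could be verbalized before being included in an LLM prompt:

```python
# Hypothetical sketch of a graph-to-text conversion: each edge of the dialogue
# state graph is verbalized with a parameterized template. In the paper, the
# traversal/template parameters are tuned on Wizard-of-Oz data; here they are
# hard-coded toy values.
state_graph = {
    ("robot", "observed", "red_chair"): {"room": "kitchen"},
    ("red_chair", "located_in", "kitchen"): {},
}

TEMPLATES = {
    "observed": "The robot saw a {obj}{loc_phrase}.",
    "located_in": "The {subj} is in the {obj}.",
}

def verbalize(triple, attributes):
    """Convert one graph edge into a natural-language sentence."""
    subj, relation, obj = triple
    room = attributes.get("room")
    loc_phrase = f" in the {room}" if room else ""
    return TEMPLATES[relation].format(subj=subj.replace("_", " "),
                                      obj=obj.replace("_", " "),
                                      loc_phrase=loc_phrase)

# The verbalized state is then prepended to the LLM prompt for response decoding
context = " ".join(verbalize(edge, attrs) for edge, attrs in state_graph.items())
print(context)
```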
Papadopoulou, A., Lison, P., Anderson, M., Øvrelid, L., & Pilán, I. (2023). Neural Text Sanitization with Privacy Risk Indicators: An Empirical Analysis. arXiv preprint arXiv:2310.14312.
@article{papadopoulou2023neural, title={Neural Text Sanitization with Privacy Risk Indicators: An Empirical Analysis}, author={Papadopoulou, Anthi and Lison, Pierre and Anderson, Mark and {\O}vrelid, Lilja and Pil{\'a}n, Ildik{\'o}}, journal={arXiv preprint arXiv:2310.14312}, year={2023} }
Text sanitization is the task of redacting a document to mask all occurrences of (direct or indirect) personal identifiers, with the goal of concealing the identity of the individual(s) referred to in it. In this paper, we consider a two-step approach to text sanitization and provide a detailed analysis of its empirical performance on two recently published datasets: the Text Anonymization Benchmark (Pilan et al., 2022) and a collection of Wikipedia biographies (Papadopoulou et al., 2022). The text sanitization process starts with a privacy-oriented entity recognizer that seeks to determine the text spans expressing identifiable personal information. This privacy-oriented entity recognizer is trained by combining a standard named entity recognition model with a gazetteer populated by person-related terms extracted from Wikidata. The second step of the text sanitization process consists in assessing the privacy risk associated with each detected text span, either isolated or in combination with other text spans. We present five distinct indicators of the re-identification risk, respectively based on language model probabilities, text span classification, sequence labelling, perturbations, and web search. We provide a contrastive analysis of each privacy indicator and highlight their benefits and limitations, notably in relation to the available labeled data.
Journal articles
Manzanares-Salor, B., Sánchez, D., & Lison, P. (2024). Evaluating the disclosure risk of anonymized documents via a machine learning-based re-identification attack. Data Mining and Knowledge Discovery, 1–36.
@article{manzanares2024evaluating, title={Evaluating the disclosure risk of anonymized documents via a machine learning-based re-identification attack}, author={Manzanares-Salor, Benet and S{\'a}nchez, David and Lison, Pierre}, journal={Data Mining and Knowledge Discovery}, pages={1--36}, year={2024}, publisher={Springer} }
The availability of textual data depicting human-centered features and behaviors is crucial for many data mining and machine learning tasks. However, data containing personal information should be anonymized prior to being made available for secondary use. A variety of text anonymization methods have been proposed in recent years, which are standardly evaluated by comparing their outputs with human-based anonymizations. The residual disclosure risk is estimated with the recall metric, which quantifies the proportion of manually annotated re-identifying terms successfully detected by the anonymization algorithm. Nevertheless, recall is not a risk metric, which leads to several drawbacks. First, it requires a unique ground truth, and this does not hold for text anonymization, where several masking choices could be equally valid to prevent re-identification. Second, it relies on human judgements, which are inherently subjective and prone to errors. Finally, the recall metric weights terms uniformly, thereby ignoring the fact that the influence of some missed terms on the disclosure risk may be much larger than that of others. To overcome these drawbacks, in this paper we propose a novel method to evaluate the disclosure risk of anonymized texts by means of an automated re-identification attack. We formalize the attack as a multi-class classification task and leverage state-of-the-art neural language models to aggregate the data sources that attackers may use to build the classifier. We illustrate the effectiveness of our method by assessing the disclosure risk of several methods for text anonymization under different attack configurations. Empirical results show substantial privacy risks for most existing anonymization methods.
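A minimal sketch of the attack setup follows, with invented data and a TF-IDF + logistic regression stand-in for the neural language models used in the paper:

```python
# Re-identification attack framed as multi-class classification: train a
# classifier on background documents linked to known individuals, then try to
# predict who each anonymized document refers to. All data here is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

background_docs = ["biography of Alice, a cardiologist born in Oslo ...",
                   "biography of Bob, an engineer born in Bergen ..."]
identities = ["alice", "bob"]

vectorizer = TfidfVectorizer()
attacker = LogisticRegression().fit(vectorizer.fit_transform(background_docs),
                                    identities)

# The attacker's success rate on anonymized documents gives an empirical
# estimate of the residual disclosure risk.
anonymized = ["*** is a cardiologist who was born in ***"]
probs = attacker.predict_proba(vectorizer.transform(anonymized))[0]
print(dict(zip(attacker.classes_, probs.round(2))))
```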
Weitzenboeck, E., Lison, P., Cyndecka, M. & Langford, M. (2022) The GDPR and unstructured data: is anonymization possible? International Data Privacy Law, 12(3): 184-206.
@article{10.1093/idpl/ipac008, author = {Weitzenboeck, Emily M and Lison, Pierre and Cyndecka, Malgorzata and Langford, Malcolm}, title = "{The GDPR and unstructured data: is anonymization possible?}", journal = {International Data Privacy Law}, volume = {12}, number = {3}, pages = {184-206}, year = {2022}, month = {03}, issn = {2044-3994}, doi = {10.1093/idpl/ipac008}, url = {https://doi.org/10.1093/idpl/ipac008}, eprint = {https://academic.oup.com/idpl/advance-article-pdf/doi/10.1093/idpl/ipac008/42981177/ipac008.pdf}, }
Much of the legal and technical literature on data anonymization has focused on structured data
such as tables. However, unstructured data such as text documents or images are far more common,
and the legal requirements that must be fulfilled to properly anonymize such data formats remain
unclear and underaddressed by the literature.
In the absence of a definition of the term ‘anonymous data’ in the General Data Protection Regulation
(GDPR), we examine its antithesis—personal data—and the identifiability test in Recital 26 GDPR to
understand what conditions must be in place for the anonymization of unstructured data.
This article examines the two contrasting approaches for determining identifiability that are prevalent
today: (i) the risk-based approach and (ii) the strict approach in the Article 29 Working Party’s
Opinion on Anonymization Techniques (WP 216).
Through two case studies, we illustrate the challenges encountered when trying to anonymize unstructured
datasets. We show that, while the risk-based approach offers a more nuanced test consistent with the
purposes of the GDPR, the strict approach of WP 216 makes anonymization of unstructured data virtually
impossible as long as the original data continues to exist.
The concluding section considers the policy implications of the strict approach and technological
developments that assist identification, and proposes a way forward.
Pilán, I., Lison, P., Øvrelid, L., Papadopoulou, A., Sánchez, D. & Batet, M. (2022) The Text Anonymization Benchmark (TAB): A Dedicated Corpus and Evaluation Framework for Text Anonymization. Computational Linguistics, 48(4): 1053-1101.
@article{10.1162/coli_a_00458, author = {Ildik{\'o} Pil{\'a}n and Pierre Lison and Lilja {\O}vrelid and Anthi Papadopoulou and David S{\'a}nchez and Montserrat Batet}, title = "{The Text Anonymization Benchmark (TAB): A Dedicated Corpus and Evaluation Framework for Text Anonymization}", journal = {Computational Linguistics}, volume = {48}, number = {4}, pages = {1053-1101}, year = {2022}, month = {12}, issn = {0891-2017}, doi = {10.1162/coli_a_00458}, url = {https://doi.org/10.1162/coli\_a\_00458}, eprint = {https://direct.mit.edu/coli/article-pdf/48/4/1053/2062009/coli\_a\_00458.pdf}, }
We present a novel benchmark and associated evaluation metrics for assessing the performance of text anonymization methods. Text anonymization, defined as the task of editing a text document to prevent the disclosure of personal information, currently suffers from a shortage of privacy-oriented annotated text resources, making it difficult to properly evaluate the level of privacy protection offered by various anonymization methods. This paper presents TAB (Text Anonymization Benchmark), a new, open-source annotated corpus developed to address this shortage. The corpus comprises 1,268 English-language court cases from the European Court of Human Rights (ECHR) enriched with comprehensive annotations about the personal information appearing in each document, including their semantic category, identifier type, confidential attributes, and co-reference relations. Compared to previous work, the TAB corpus is designed to go beyond traditional de-identification (which is limited to the detection of predefined semantic categories), and explicitly marks which text spans ought to be masked in order to conceal the identity of the person to be protected. Along with presenting the corpus and its annotation layers, we also propose a set of evaluation metrics that are specifically tailored towards measuring the performance of text anonymization, both in terms of privacy protection and utility preservation. We illustrate the use of the benchmark and the proposed metrics by assessing the empirical performance of several baseline text anonymization models. The full corpus along with its privacy-oriented annotation guidelines, evaluation scripts and baseline models are available at: https://github.com/norskregnesentral/text-anonymisation-benchmark.
Dragone, P. & Lison, P. (2016) Classification and Resolution of Non-Sentential Utterances in Dialogue. Italian Journal of Computational Linguistics, 2(1).
@article{ijcol2016, author={Paolo Dragone and Pierre Lison}, title = {Classification and Resolution of Non-Sentential Utterances in Dialogue}, journal={Italian Journal of Computational Linguistics}, year={2016}, volume={2}, number={1}, month={1} }
This article addresses the problems of classification and resolution of non-sentential utterances
(NSUs) in dialogue. NSUs are utterances that do not have a complete sentential form but convey a
full clausal meaning given the conversational context, such as "To the contrary!" or "How much?".
The presented approach builds upon the work of Fernandez, Ginzburg, and Lappin (2007), who provide a
taxonomy of NSUs divided into 15 classes along with a small annotated corpus extracted from dialogue
transcripts. The main part of this article focuses on the automatic classification of NSUs according
to these classes. We show that a combination of novel linguistic features and active learning
techniques yields a significant improvement in the classification accuracy over the
state-of-the-art, and is able to mitigate the scarcity of labelled data. Based on this classifier,
the article also presents a novel approach for the semantic resolution of NSUs in context using
probabilistic rules.
Lison, P. (2015) A hybrid approach to dialogue management based on probabilistic rules. Computer Speech & Language, 34(1):232-255.
@article{Lison2015, title = "A hybrid approach to dialogue management based on probabilistic rules", journal = "Computer Speech \& Language ", volume = "34", number = "1", pages = "232 - 255", year = "2015", issn = "0885-2308", author = "Pierre Lison" }
We present a new modelling framework for dialogue management based on the concept of probabilistic
rules. Probabilistic rules are defined as structured mappings between logical conditions and
probabilistic effects. They function as high-level templates for probabilistic graphical models and
may include unknown parameters whose values are estimated from data using Bayesian inference. Thanks
to their use of logical abstractions, probabilistic rules are able to encode the probability and
utility models employed in dialogue management in a compact and human-readable form. As a
consequence, they can reduce the amount of dialogue data required for parameter estimation and allow
system designers to directly incorporate their expert domain knowledge into the dialogue models.
Empirical results of a user evaluation in a human-robot interaction task with 37 participants show
that a dialogue manager structured with probabilistic rules outperforms both purely hand-crafted and
purely statistical methods on a range of subjective and objective quality metrics. The framework is
implemented in a software toolkit called OpenDial, which can be used to develop various types of
dialogue systems based on probabilistic rules.
Lison, P. & Meena, R. (2014) Spoken Dialogue Systems: The New Frontier in Human-computer Interaction. XRDS: Crossroads, 21(1):46-51, ACM.
@article{Lison:2014:SDS:2677339.2659891, author = {Lison, Pierre and Meena, Raveesh}, title = {Spoken Dialogue Systems: The New Frontier in Human-computer Interaction}, journal = {XRDS: Crossroads}, issue_date = {Fall 2014}, volume = {21}, number = {1}, month = oct, year = {2014}, issn = {1528-4972}, pages = {46--51}, numpages = {6}, acmid = {2659891}, publisher = {ACM}, address = {New York, NY, USA}, }
Wyatt, J., Aydemir, A., Brenner, M., Hanheide, M., Hawes, N., Jensfelt, P., Kristan, M., Kruijff, G.J., Lison, P., Pronobis, A., Sjöö, K., Skočaj, D., Vrečko, A., Zender, H. & Zillich, M. (2010) Self-Understanding & Self-Extension: A Systems and Representational Approach. IEEE Transactions on Autonomous Mental Development, 2(4):282-303.
@article{tamd-architecture, Author = {Jeremy Wyatt and Alper Aydemir and Michael Brenner and Marc Hanheide and Nick Hawes and Patric Jensfelt and Matej Kristan and Geert-Jan Kruijff and Pierre Lison and Andrzej Pronobis and Kristoffer Sj\"{o}\"{o} and Danijel Sko\v{c}aj and Alen Vre\v{c}ko and Hendrik Zender and Michael Zillich}, Journal = {IEEE Transactions on Autonomous Mental Development}, Month = {December}, Number = {4}, Pages = {282-303}, Title = {Self-Understanding \& Self-Extension: A Systems and Representational Approach}, Volume = {2}, Year = {2010}}
There are many different approaches to building a system that can engage in autonomous mental
development. In this paper we present an approach based on what we term self-understanding, by
which we mean the use of explicit representation of and reasoning about what a system does and
doesn't know, and how that understanding changes under action. We present a coherent architecture
and a set of representations used in two robot systems that exhibit a limited degree of autonomous
mental development, what we term self-extension. The contributions include: representations of gaps
and uncertainty for specific kinds of knowledge, and a motivational and planning system for setting
and achieving learning goals.
Conference papers
Pilan, I., Prévot, L., Buschmeier, H., & Lison, P. (2024). Conversational Feedback in Scripted versus Spontaneous Dialogues: A Comparative Analysis. In T. Kawahara, V. Demberg, S. Ultes, K. Inoue, S. Mehri, D. Howcroft, & K. Komatani (Eds.), Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue (pp. 440–457). Association for Computational Linguistics.
@inproceedings{pilan-etal-2024-conversational, title = "Conversational Feedback in Scripted versus Spontaneous Dialogues: A Comparative Analysis", author = "Pilan, Ildiko and Pr{\'e}vot, Laurent and Buschmeier, Hendrik and Lison, Pierre", editor = "Kawahara, Tatsuya and Demberg, Vera and Ultes, Stefan and Inoue, Koji and Mehri, Shikib and Howcroft, David and Komatani, Kazunori", booktitle = "Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue", month = sep, year = "2024", address = "Kyoto, Japan", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2024.sigdial-1.38", doi = "10.18653/v1/2024.sigdial-1.38", pages = "440--457", abstract = "Scripted dialogues such as movie and TV subtitles constitute a widespread source of training data for conversational NLP models. However, there are notable linguistic differences between these dialogues and spontaneous interactions, especially regarding the occurrence of communicative feedback such as backchannels, acknowledgments, or clarification requests. This paper presents a quantitative analysis of such feedback phenomena in both subtitles and spontaneous conversations. Based on conversational data spanning eight languages and multiple genres, we extract lexical statistics, classifications from a dialogue act tagger, expert annotations and labels derived from a fine-tuned Large Language Model (LLM). Our main empirical findings are that (1) communicative feedback is markedly less frequent in subtitles than in spontaneous dialogues and (2) subtitles contain a higher proportion of negative feedback. We also show that dialogues generated by standard LLMs lie much closer to scripted dialogues than spontaneous interactions in terms of communicative feedback.", }
Scripted dialogues such as movie and TV subtitles constitute a widespread source of training data for conversational NLP models. However, there are notable linguistic differences between these dialogues and spontaneous interactions, especially regarding the occurrence of communicative feedback such as backchannels, acknowledgments, or clarification requests. This paper presents a quantitative analysis of such feedback phenomena in both subtitles and spontaneous conversations. Based on conversational data spanning eight languages and multiple genres, we extract lexical statistics, classifications from a dialogue act tagger, expert annotations and labels derived from a fine-tuned Large Language Model (LLM). Our main empirical findings are that (1) communicative feedback is markedly less frequent in subtitles than in spontaneous dialogues and (2) subtitles contain a higher proportion of negative feedback. We also show that dialogues generated by standard LLMs lie much closer to scripted dialogues than spontaneous interactions in terms of communicative feedback.
Walker, N., Ultes, S., & Lison, P. (2023). Retrieval-Augmented Neural Response Generation Using Logical Reasoning and Relevance Scoring. Proceedings of the 27th Workshop on the Semantics and Pragmatics of Dialogue.
@inproceedings{walker-etal-2023-retrieval, title = "Retrieval-Augmented Neural Response Generation Using Logical Reasoning and Relevance Scoring", author = "Walker, Nicholas and Ultes, Stefan and Lison, Pierre", booktitle = "Proceedings of the 27th Workshop on the Semantics and Pragmatics of Dialogue", year = "2023", address = "Maribor, Slovenia" }
Constructing responses in task-oriented dialogue systems typically relies on information sources such as the current dialogue state or external databases. This paper presents a novel approach to knowledge-grounded response generation that combines retrieval-augmented language models with logical reasoning. The approach revolves around a knowledge graph representing the current dialogue state and background information, and proceeds in three steps. The knowledge graph is first enriched with logically derived facts inferred using probabilistic logical programming. A neural model is then employed at each turn to score the conversational relevance of each node and edge of this extended graph. Finally, the elements with highest relevance scores are converted to a natural language form, and are integrated into the prompt for the neural conversational model employed to generate the system response. We investigate the benefits of the proposed approach on two datasets (KVRET and GraphWOZ) along with a human evaluation. Experimental results show that the combination of (probabilistic) logical reasoning with conversational relevance scoring does increase both the factuality and fluency of the responses.
Barnes, J., Touileb, S., Mæhlum, P., & Lison, P. (2023). Identifying Token-Level Dialectal Features in Social Media. In T. Alumäe & M. Fishel (Eds.), Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa) (pp. 146–158). University of Tartu Library.
@inproceedings{barnes-etal-2023-identifying, title = "Identifying Token-Level Dialectal Features in Social Media", author = "Barnes, Jeremy and Touileb, Samia and M{\ae}hlum, Petter and Lison, Pierre", editor = {Alum{\"a}e, Tanel and Fishel, Mark}, booktitle = "Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)", month = may, year = "2023", address = "T{\'o}rshavn, Faroe Islands", publisher = "University of Tartu Library", pages = "146--158", }
Dialectal variation is present in many human languages and is attracting a growing interest in NLP. Most previous work concentrated on either (1) classifying dialectal varieties at the document or sentence level or (2) performing standard NLP tasks on dialectal data. In this paper, we propose the novel task of token-level dialectal feature prediction. We present a set of fine-grained annotation guidelines for Norwegian dialects, expand a corpus of dialectal tweets, and manually annotate them using the introduced guidelines. Furthermore, to evaluate the learnability of our task, we conduct labeling experiments using a collection of baselines, weakly supervised and supervised sequence labeling models. The obtained results show that, despite the difficulty of the task and the scarcity of training data, many dialectal features can be predicted with reasonably high accuracy.
Olstad, A. W., Papadopoulou, A., & Lison, P. (2023). Generation of Replacement Options in Text Sanitization. In T. Alumäe & M. Fishel (Eds.), Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa) (pp. 292–300). University of Tartu Library.
@inproceedings{olstad-etal-2023-generation, title = "Generation of Replacement Options in Text Sanitization", author = "Olstad, Annika Willoch and Papadopoulou, Anthi and Lison, Pierre", editor = {Alum{\"a}e, Tanel and Fishel, Mark}, booktitle = "Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)", year = "2023", address = "T{\'o}rshavn, Faroe Islands", publisher = "University of Tartu Library", pages = "292--300", }
The purpose of text sanitization is to edit text documents to mask text spans that may directly or indirectly reveal personal information. An important problem in text sanitization is to find less specific, yet still informative replacements for each text span to mask. We present an approach to generate possible replacements using a combination of heuristic rules and an ontology derived from Wikidata. Those replacement options are hierarchically structured and cover various types of personal identifiers. Using this approach, we extend a recently released text sanitization dataset with manually selected replacements. The outcome of this data collection shows that the approach is able to suggest appropriate replacement options for most text spans.
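The core idea can be sketched with a toy hypernym hierarchy (a stand-in for the Wikidata-derived ontology described in the paper; all entries below are invented):

```python
# Generating increasingly generic replacement options for a text span by
# walking up a (toy) ontology of hypernyms.
PARENTS = {
    "Oslo": "Norwegian city",
    "Norwegian city": "Nordic city",
    "Nordic city": "European city",
    "European city": "city",
}

def replacement_options(span):
    """Return replacement candidates ordered from specific to generic."""
    options, node = [], span
    while node in PARENTS:
        node = PARENTS[node]
        options.append(node)
    return options

print(replacement_options("Oslo"))
# -> ['Norwegian city', 'Nordic city', 'European city', 'city']
```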
Høst, A., Lison, P., & Moonen, L. (2023). Constructing a Knowledge Graph from Textual Descriptions of Software Vulnerabilities in the National Vulnerability Database. In T. Alumäe & M. Fishel (Eds.), Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa) (pp. 386–391). University of Tartu Library.
@inproceedings{host-etal-2023-constructing, title = "Constructing a Knowledge Graph from Textual Descriptions of Software Vulnerabilities in the National Vulnerability Database", author = "H{\o}st, Anders and Lison, Pierre and Moonen, Leon", editor = {Alum{\"a}e, Tanel and Fishel, Mark}, booktitle = "Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)", month = may, year = "2023", address = "T{\'o}rshavn, Faroe Islands", publisher = "University of Tartu Library", pages = "386--391", }
Knowledge graphs have shown promise for several cybersecurity tasks, such as vulnerability assessment and threat analysis. In this work, we present a new method for constructing a vulnerability knowledge graph from information in the National Vulnerability Database (NVD). Our approach combines named entity recognition (NER), relation extraction (RE), and entity prediction using a combination of neural models, heuristic rules, and knowledge graph embeddings. We demonstrate how our method helps to fix missing entities in knowledge graphs used for cybersecurity and evaluate its performance.
Lison, P., & Kennington, C. (2023). Who’s in Charge? Roles and Responsibilities of Decision-Making Components in Conversational Robots. Proceedings of the HRI 2023 Workshop on Human-Robot Conversational Interaction.
@inproceedings{lison2023s, title={Who's in Charge? Roles and Responsibilities of Decision-Making Components in Conversational Robots}, author={Lison, Pierre and Kennington, Casey}, booktitle={Proceedings of the HRI 2023 Workshop on Human-Robot Conversational Interaction}, year={2023} }
Software architectures for conversational robots typically consist of multiple modules, each designed for a particular processing task or functionality. Some of these modules are developed for the purpose of making decisions about the next action that the robot ought to perform in the current context. Those actions may relate to physical movements, such as driving forward or grasping an object, but may also correspond to communicative acts, such as asking a question to the human user. In this position paper, we reflect on the organization of those decision modules in human-robot interaction platforms. We discuss the relative benefits and limitations of modular vs. end-to-end architectures, and argue that, despite the increasing popularity of end-to-end approaches, modular architectures remain preferable when developing conversational robots designed to execute complex tasks in collaboration with human users. We also show that most practical HRI architectures tend to be either robot-centric or dialogue-centric, depending on where developers wish to place the "command center" of their system. While those design choices may be justified in some application domains, they also limit the robot's ability to flexibly interleave physical movements and conversational behaviours. We contend that architectures placing "action managers" and "interaction managers" on an equal footing may provide the best path forward for future human-robot interaction systems.
Walker, N. T., Ultes, S., & Lison, P. (2023). GraphWOZ: Dialogue Management with Conversational Knowledge Graphs. Proceedings of the 13th International Workshop on Spoken Dialogue System Technology (IWSDS 2023).
@inproceedings{walker2023graphwoz, title={GraphWOZ: Dialogue Management with Conversational Knowledge Graphs}, author={Walker, Nicholas Thomas and Ultes, Stefan and Lison, Pierre}, booktitle={Proceedings of the 13th International Workshop on Spoken Dialogue System Technology (IWSDS 2023)}, year={2023} }
We present a new approach to dialogue management using conversational knowledge graphs as core representation of the dialogue state. To this end, we introduce a new dataset, GraphWOZ, which comprises Wizard-of-Oz dialogues in which human participants interact with a robot acting as a receptionist. In contrast to most existing work on dialogue management, GraphWOZ relies on a dialogue state explicitly represented as a dynamic knowledge graph instead of a fixed set of slots. This graph is composed of a varying number of entities (such as individuals, places, events, utterances and mentions) and relations between them (such as persons being part of a group or attending an event). The graph is then regularly updated on the basis of new observations and system actions. GraphWOZ is released along with detailed manual annotations related to the user intents, system responses, and reference relations occurring in both user and system turns. Based on GraphWOZ, we present experimental results for two dialogue management tasks, namely conversational entity linking and response ranking. For conversational entity linking, we show how to connect utterance mentions to their corresponding entity in the knowledge graph with a neural model relying on a combination of both string and graph-based features. Response ranking is then performed by summarizing the relevant content of the graph into a text, which is concatenated with the dialogue history and employed as input to score possible responses to a given dialogue state.
Papadopoulou, Anthi, Yu, Yunhao, Lison, Pierre and Øvrelid, Lilja (2022) Neural Text Sanitization with Explicit Measures of Privacy Risk. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
@inproceedings{papadopoulou-etal-2022-neural, title = "Neural Text Sanitization with Explicit Measures of Privacy Risk", author = "Papadopoulou, Anthi and Yu, Yunhao and Lison, Pierre and {\O}vrelid, Lilja", booktitle = "Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)", month = nov, year = "2022", address = "Online only", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2022.aacl-main.18", pages = "217--229" }
We present a novel approach for text sanitization, which is the task of editing a document to mask all (direct and indirect) personal identifiers and thereby conceal the identity of the individual(s) mentioned in the text. In contrast to previous work, the approach relies on explicit measures of privacy risk, making it possible to explicitly control the trade-off between privacy protection and data utility. The approach proceeds in three steps. A neural, privacy-enhanced entity recognizer is first employed to detect and classify potential personal identifiers. We then determine which entities, or combination of entities, are likely to pose a re-identification risk through a range of privacy risk assessment measures. We present three such measures of privacy risk, respectively based on (1) span probabilities derived from a BERT language model, (2) web search queries and (3) a classifier trained on labelled data. Finally, a linear optimization solver decides which entities to mask to minimize the semantic loss while simultaneously ensuring that the estimated privacy risk remains under a given threshold. We evaluate the approach both in the absence and presence of manually annotated data. Our results highlight the potential of the approach, as well as issues that specific types of personal data can introduce to the process.
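The final optimization step can be sketched as a small integer linear program, here using the PuLP library with invented loss/risk numbers (the paper's risk measures are considerably richer):

```python
# Choose which detected entities to mask: minimize the semantic loss of masked
# entities while keeping the total risk of the *unmasked* ones below a
# threshold. Entities, losses and risks are toy values.
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary

entities = ["John Smith", "cardiologist", "Oslo", "1982"]
loss = {"John Smith": 5, "cardiologist": 3, "Oslo": 2, "1982": 1}
risk = {"John Smith": 0.9, "cardiologist": 0.3, "Oslo": 0.2, "1982": 0.4}
RISK_THRESHOLD = 0.5

mask = {e: LpVariable(f"mask_{i}", cat=LpBinary)
        for i, e in enumerate(entities)}
problem = LpProblem("text_sanitization", LpMinimize)
problem += lpSum(loss[e] * mask[e] for e in entities)   # semantic loss
problem += lpSum(risk[e] * (1 - mask[e]) for e in entities) <= RISK_THRESHOLD
problem.solve()
print({e: int(mask[e].value()) for e in entities})      # 1 = mask this span
```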
Manzanares-Salor, B., Sánchez, D., & Lison, P. (2022). Automatic Evaluation of Disclosure Risks of Text Anonymization Methods. International Conference on Privacy in Statistical Databases, 157–171.
@inproceedings{manzanares2022automatic, title={Automatic Evaluation of Disclosure Risks of Text Anonymization Methods}, author={Manzanares-Salor, Benet and S{\'a}nchez, David and Lison, Pierre}, booktitle={International Conference on Privacy in Statistical Databases}, pages={157--171}, year={2022}, organization={Springer} }
The standard approach to evaluate text anonymization methods consists of comparing their outcomes with the anonymization performed by human experts. The degree of privacy protection attained is then measured with the IR-based recall metric, which expresses the proportion of re-identifying terms that were correctly detected by the anonymization method. However, the use of recall to estimate the degree of privacy protection suffers from several limitations. The first is that it assigns a uniform weight to each re-identifying term, thereby ignoring the fact that some missed re-identifying terms may have a larger influence on the disclosure risk than others. Furthermore, IR-based metrics assume the existence of a single gold standard annotation. This assumption does not hold for text anonymization, where several maskings (each one encompassing a different combination of terms) could be equally valid to prevent disclosure. Finally, those metrics rely on manually anonymized datasets, which are inherently subjective and may be prone to various errors, omissions and inconsistencies. To tackle these issues, we propose an automatic re-identification attack for (anonymized) texts that provides a realistic assessment of disclosure risks. Our method follows a similar premise as the well-known record linkage methods employed to evaluate anonymized structured data, and leverages state-of-the-art deep learning language models to exploit the background knowledge available to potential attackers. We also report empirical evaluations of several well-known methods and tools for text anonymization. Results show significant re-identification risks for all methods, including manual anonymization efforts.
Papadopoulou, Anthi, Lison, Pierre, Øvrelid, Lilja and Pilán, Ildikó (2022) Bootstrapping Text Anonymization Models with Distant Supervision. In Proceedings of the Language Resources and Evaluation Conference. ELRA, Marseille, France.
@InProceedings{papadopoulou-EtAl:2022:LREC, author = {Papadopoulou, Anthi and Lison, Pierre and {\O}vrelid, Lilja and Pil{\'a}n, Ildik{\'o}}, title = {Bootstrapping Text Anonymization Models with Distant Supervision}, booktitle = {Proceedings of the Language Resources and Evaluation Conference}, month = {June}, year = {2022}, address = {Marseille, France}, publisher = {European Language Resources Association}, pages = {4477--4487}, url = {https://aclanthology.org/2022.lrec-1.476} }
We propose a novel method to bootstrap text anonymization models based on distant supervision. Instead of requiring manually labeled training data, the approach relies on a knowledge graph expressing the background information assumed to be publicly available about various individuals. This knowledge graph is employed to automatically annotate text documents including personal data about a subset of those individuals. More precisely, the method determines which text spans ought to be masked in order to guarantee k-anonymity, assuming an adversary with access to both the text documents and the background information expressed in the knowledge graph. The resulting collection of labeled documents is then used as training data to fine-tune a pre-trained language model for text anonymization. We illustrate this approach using a knowledge graph extracted from Wikidata and short biographical texts from Wikipedia. Evaluation results with a RoBERTa-based model and a manually annotated collection of 553 summaries showcase the potential of the approach, but also unveil a number of issues that may arise if the knowledge graph is noisy or incomplete. The results also illustrate that, contrary to most sequence labeling problems, the text anonymization task may admit several alternative solutions.
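A toy version of the labelling logic follows, assuming an invented mini knowledge graph in place of the Wikidata subgraph used in the paper:

```python
# Distant supervision for k-anonymity: greedily mask the most identifying
# attributes of a person until at least k individuals in the background
# knowledge remain consistent with the unmasked attributes.
knowledge_graph = {
    "anna":  {"journalist", "norwegian", "born_1975", "lives_in_oslo"},
    "bjorn": {"journalist", "norwegian", "born_1980", "lives_in_bergen"},
    "carl":  {"engineer",   "norwegian", "born_1975", "lives_in_oslo"},
}

def spans_to_mask(person, k=2):
    """Return the attributes to mask so that >= k individuals match."""
    kept = set(knowledge_graph[person])
    masked = set()
    matches = lambda attrs: sum(attrs <= facts
                                for facts in knowledge_graph.values())
    while matches(kept) < k and kept:
        # drop the attribute whose removal leaves the most candidates
        worst = max(kept, key=lambda a: matches(kept - {a}))
        kept.remove(worst)
        masked.add(worst)
    return masked

print(spans_to_mask("anna"))   # -> {'journalist'}
```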
Walker, Nicholas, Dahl, Torbjørn and Lison, Pierre (2022) Dialogue Management as Graph Transformations. In Conversational AI for Natural Human-Centric Interaction (Proceedings of the 12th International Workshop on Spoken Dialogue System Technology). IWSDS 2021, Singapore.
@InProceedings{iwsds2021, author = {Walker, Nicholas and Dahl, Torbj{\o}rn and Lison, Pierre}, title = {Dialogue Management as Graph Transformations}, booktitle = {Conversational AI for Natural Human-Centric Interaction (Proceedings of the 12th International Workshop on Spoken Dialogue System Technology)}, year = {2022}, publisher = {Springer}, url = {https://home.nr.no/~plison/pdfs/cl/IWSDS_2021.pdf} }
We present ongoing work on a new dialogue management framework using graphs as core representations for the current dialogue state. Dialogue management tasks such as state tracking and action selection are framed as sequences of graph transformations that repeatedly update this graph based on incoming observations. Those graph transformations are expressed using a graph query language, making it possible to specify all dialogue management operations through a unified, declarative syntax. We argue that graphs are particularly well-suited to model the dialogue state of complex, open-ended domains. In contrast to traditional state representations that are limited to fixed, predefined slots, graphs can naturally express dialogue domains with rich relational structures and variable numbers of entities to track. We describe how dialogue state tracking and action selection can be practically modelled in such a graph-centric view of dialogue management, using either handcrafted rules or data-driven models (or a combination of both). We also briefly discuss how to account for some aspects of dialogue management such as uncertainties, incremental inputs and contextual knowledge. Finally, we describe a proof-of-concept study of this dialogue management framework in a human-robot interaction scenario.
Hassan, S. Z., Salehi, P., Røed, R. K., Halvorsen, P., Baugerud, G. A., Johnson, M. S., Lison, P., Riegler, M., Lamb, M. E., Griwodz, C., & Sabet, S. S. (2022). Towards an AI-Driven Talking Avatar in Virtual Reality for Investigative Interviews of Children. Proceedings of the 2nd Workshop on Games Systems, 9–15.
@inproceedings{10.1145/3534085.3534340, author = {Hassan, Syed Zohaib and Salehi, Pegah and R\o{}ed, Ragnhild Klingenberg and Halvorsen, P\r{a}l and Baugerud, Gunn Astrid and Johnson, Miriam Sinkerud and Lison, Pierre and Riegler, Michael and Lamb, Michael E. and Griwodz, Carsten and Sabet, Saeed Shafiee}, title = {Towards an AI-Driven Talking Avatar in Virtual Reality for Investigative Interviews of Children}, year = {2022}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, booktitle = {Proceedings of the 2nd Workshop on Games Systems}, pages = {9–15}, numpages = {7}, keywords = {child protection services (CPS), dialogue model, virtual reality (VR), generative adversarial networks (GANs), quality of experience (QoE), AI, avatar}, location = {Athlone, Ireland}, series = {GameSys '22} }
Artificial intelligence (AI) and gaming systems have advanced to the stage where the current models and technologies can be used to address real-world problems. The development of such systems comes with different challenges, e.g., most of them related to system performance, complexity and user testing. Using a virtual reality (VR) environment, we have designed and developed a game-like system aiming to mimic an abused child that can help police and child protection service (CPS) personnel in interview training of maltreated children. Current research in this area points to the poor quality of conducted interviews, and emphasises the need for better training methods. Information obtained in these interviews is the core piece of evidence in the prosecution process. We utilised advanced dialogue models, talking visual avatars, and VR to build a virtual child avatar that can interact with users. We discuss our proposed architecture and the performance of the developed child avatar prototype, and we present the results from the user study conducted with CPS personnel. The user study investigates the users' perceived quality of experience (QoE) and their learning effects. Our study confirms that such a gaming system can increase the knowledge and skills of the users. We also benchmark and discuss the system performance aspects of the child avatar. Our results show that the proposed prototype works well in practice and is well received by the interview experts.
Lison, P., Pilán, I., Sánchez, D., Batet, M. and Øvrelid, L. (2021) Anonymisation Models for Text Data: State of the art, Challenges and Future Directions. In Proceedings of the 2021 Annual Conference of the Association for Computational Linguistics (ACL 2021).
@inproceedings{lison-etal-2021-anonymisation, title = "Anonymisation Models for Text Data: State of the art, Challenges and Future Directions", author = "Lison, Pierre and Pil{\'a}n, Ildik{\'o} and Sanchez, David and Batet, Montserrat and {\O}vrelid, Lilja", booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)", month = aug, year = "2021", address = "Online", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2021.acl-long.323", doi = "10.18653/v1/2021.acl-long.323", pages = "4188--4203" }
This position paper investigates the problem of automated text anonymisation, which is a prerequisite for secure sharing of documents containing sensitive information about individuals. We summarise the key concepts behind text anonymisation and provide a review of current approaches. Anonymisation methods have so far been developed in two fields with little mutual interaction, namely natural language processing and privacy-preserving data publishing. Based on a case study, we outline the benefits and limitations of these approaches and discuss a number of open challenges, such as (1) how to account for multiple types of semantic inferences, (2) how to strike a balance between disclosure risk and data utility and (3) how to evaluate the quality of the resulting anonymisation. We lay out a case for moving beyond sequence labelling models and incorporating explicit measures of disclosure risk into the text anonymisation process.
Lison, P., Barnes, J. and Hubin, A. (2021) skweak: Weak Supervision Made Easy for NLP. In Proceedings of the 2021 Annual Conference of the Association for Computational Linguistics (ACL 2021, Demonstrations).
@inproceedings{lison-etal-2021-skweak, title = "skweak: Weak Supervision Made Easy for {NLP}", author = "Lison, Pierre and Barnes, Jeremy and Hubin, Aliaksandr", booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations", month = aug, year = "2021", address = "Online", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2021.acl-demo.40", doi = "10.18653/v1/2021.acl-demo.40", pages = "337--346" }
We present skweak, a versatile, Python-based software toolkit enabling NLP developers to apply weak supervision to a wide range of NLP tasks. Weak supervision is an emerging machine learning paradigm based on a simple idea: instead of labelling data points by hand, we use labelling functions derived from domain knowledge to automatically obtain annotations for a given dataset. The resulting labels are then aggregated with a generative model that estimates the accuracy (and possible confusions) of each labelling function. The skweak toolkit makes it easy to implement a large spectrum of labelling functions (such as heuristics, gazetteers, neural models or linguistic constraints) on text data, apply them on a corpus, and aggregate their results in a fully unsupervised fashion. skweak is especially designed to facilitate the use of weak supervision for NLP tasks such as text classification and sequence labelling. We illustrate the use of skweak for NER and sentiment analysis. skweak is released under an open-source license and is available at https://github.com/NorskRegnesentral/skweak
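A short usage sketch following the pattern in the toolkit's documentation (exact API details may differ between skweak versions):

```python
# Define two labelling functions, apply them to a spaCy document, and
# aggregate their (possibly conflicting) outputs with skweak's HMM model.
import re
import spacy
from skweak import heuristics, gazetteers, aggregation

# Labelling function 1: token-level heuristic for year expressions
lf_years = heuristics.TokenConstraintAnnotator(
    "years", lambda tok: re.match(r"(19|20)\d{2}$", tok.text), "DATE")

# Labelling function 2: a tiny gazetteer of person names
trie = gazetteers.Trie([("Pierre", "Lison")])
lf_names = gazetteers.GazetteerAnnotator("names", {"PERSON": trie})

doc = spacy.load("en_core_web_sm")("Pierre Lison released skweak in 2021.")
doc = lf_names(lf_years(doc))

# Aggregate the annotation layers in a fully unsupervised fashion
hmm = aggregation.HMM("hmm", ["PERSON", "DATE"])
hmm.fit_and_aggregate([doc])
print(doc.spans["hmm"])
```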
Olsen, J., Næss, A. B. and Lison, P. (2021) Assessing the Quality of Human-Generated Summaries with Weakly Supervised Learning. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa).
@inproceedings{olsen-etal-2021-assessing, title = "Assessing the Quality of Human-Generated Summaries with Weakly Supervised Learning", author = "Olsen, Joakim and N{\ae}ss, Arild Brandrud and Lison, Pierre", booktitle = "Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)", month = may # " 31--2 " # jun, year = "2021", address = "Reykjavik, Iceland (Online)", publisher = {Link{\"o}ping University Electronic Press, Sweden}, url = "https://aclanthology.org/2021.nodalida-main.12", pages = "112--123" }
This paper explores how to automatically measure the quality of human-generated summaries, based on a Norwegian corpus of real estate condition reports and their corresponding summaries. The proposed approach proceeds in two steps. First, the real estate reports and their associated summaries are automatically labelled using a set of heuristic rules gathered from human experts and aggregated using weak supervision. The aggregated labels are then employed to learn a neural model that takes a document and its summary as inputs and outputs a score reflecting the predicted quality of the summary. The neural model maps the document and its summary to a shared "summary content space" and computes the cosine similarity between the two document embeddings to predict the final summary quality score. The best performance is achieved by a CNN-based model with an accuracy (measured against the aggregated labels obtained via weak supervision) of 89.5%, compared to 72.6% for the best unsupervised model. Manual inspection of examples indicates that the weak supervision labels do capture important indicators of summary quality, but the correlation of those labels with human judgements remains to be validated. Our models of summary quality predict that approximately 30% of the real estate reports in the corpus have a summary of poor quality.
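Schematically, the scoring step amounts to a cosine similarity in the shared embedding space (the encoder below is a deterministic random stub standing in for the trained CNN):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for the learned encoder mapping a text into the shared
    'summary content space'."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.standard_normal(128)

def summary_quality(report: str, summary: str) -> float:
    """Cosine similarity between report and summary embeddings."""
    u, v = embed(report), embed(summary)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(summary_quality("full condition report ...", "proposed summary ..."))
```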
Lison, P., Barnes, J., Hubin, A. and Touileb, S. (2020) Named Entity Recognition without Labelled Data: A Weak Supervision Approach. In Proceedings of the 2020 Annual Conference of the Association for Computational Linguistics (ACL 2020).
@inproceedings{lison-etal-2020-named, title = "Named Entity Recognition without Labelled Data: A Weak Supervision Approach", author = "Lison, Pierre and Barnes, Jeremy and Hubin, Aliaksandr and Touileb, Samia", booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics", month = jul, year = "2020", address = "Online", publisher = "Association for Computational Linguistics", url = "https://www.aclweb.org/anthology/2020.acl-main.139", pages = "1518--1533" }
Named Entity Recognition (NER) performance often degrades rapidly when applied to target domains
that differ from the texts observed during training. When in-domain labelled data is available,
transfer learning techniques can be used to adapt existing NER models to the target domain.
But what should one do when there is no hand-labelled data for the target domain? This paper
presents a simple but powerful approach to learn NER models in the absence of labelled data
through weak supervision. The approach relies on a broad spectrum of labelling functions to
automatically annotate texts from the target domain. These annotations are then merged together
using a hidden Markov model which captures the varying accuracies and confusions of the labelling
functions. A sequence labelling model can finally be trained on the basis of this unified annotation.
We evaluate the approach on two English datasets (CoNLL 2003 and news articles from Reuters and
Bloomberg) and demonstrate an improvement of about 7 percentage points in entity-level F1 scores
compared to an out-of-domain neural NER model.
Jang, Y., Lee, J., Park, J., Lee, K., Lison, P. and Kim, K.-E. (2019) PyOpenDial: A Python-based Domain-Independent Toolkit for Developing Spoken Dialogue Systems with Probabilistic Rules. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations, Hong Kong, China.
@inproceedings{jang-etal-2019-pyopendial, title = "{P}y{O}pen{D}ial: A Python-based Domain-Independent Toolkit for Developing Spoken Dialogue Systems with Probabilistic Rules", author = "Jang, Youngsoo and Lee, Jongmin and Park, Jaeyoung and Lee, Kyeng-Hun and Lison, Pierre and Kim, Kee-Eung", booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations", month = nov, year = "2019", address = "Hong Kong, China", publisher = "Association for Computational Linguistics", url = "https://www.aclweb.org/anthology/D19-3032", doi = "10.18653/v1/D19-3032", pages = "187--192" }
We present PyOpenDial, a Python-based domain-independent, open-source toolkit for spoken dialogue systems.
Recent advances in core components of dialogue systems, such as speech recognition, language understanding,
dialogue management, and language generation, harness deep learning to achieve state-of-the-art performance.
The original OpenDial, implemented in Java, provides a plugin architecture to integrate external modules, but
lacks Python bindings, making it difficult to interface with popular deep learning frameworks such as Tensorflow
or PyTorch. To this end, we re-implemented OpenDial in Python and extended the toolkit with a number of novel
functionalities for neural dialogue state tracking and action planning. We describe the overall architecture and
its extensions, and illustrate their use on an example where the system response model is implemented with a
recurrent neural network.
Prévot, L., Magistry, P. and Lison, P. (2019) Should we use movie subtitles to study linguistic patterns of conversational speech? A study based on French, English and Taiwan Mandarin. In International Symposium on Linguistic Patterns of Spontaneous Speech, Taipei, Taiwan.
@inproceedings{lpss2019, title = "Should we use movie subtitles to study linguistic patterns of conversational speech? A study based on {F}rench, {E}nglish and {T}aiwan {M}andarin", author = "Laurent Prévot and Pierre Magistry and Pierre Lison", booktitle = "Third International Symposium on Linguistic Patterns of Spontaneous Speech", year = "2019", address = "Taipei, Taiwan" }
Linguistic research benefits from the wide range of resources and software tools developed for natural language processing
(NLP) tasks. However, NLP has a strong historical bias towards written language, thereby making these resources and tools
often inadequate to address research questions related to the linguistic patterns of spontaneous speech. In this preliminary
study, we investigate whether corpora of movie and TV subtitles can be employed to estimate data-driven NLP models adapted to
conversational speech. In particular, the presented work explores lexical and syntactic distributional aspects across three
genres (conversational, written and subtitles) and three languages (French, English and Taiwan Mandarin). Ongoing work focuses
on comparing these three genres on the basis of deeper syntactic conversational patterns, using graph-based modelling and
visualisation.
Lison, P., Tiedemann, J. & Kouylekov, M. (2018) OpenSubtitles 2018: Statistical rescoring of sentence alignments in large, noisy parallel corpora. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC-2018).
@InProceedings{lrec2018, author= {Pierre Lison and J{\"o}rg Tiedemann and Milen Kouylekov}, title = {{OpenSubtitles} 2018: Statistical rescoring of sentence alignments in large, noisy parallel corpora}, booktitle = {Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC-2018)}, address = {Miyazaki, Japan}, year = {2018} }
Movie and TV subtitles are a highly valuable resource for the compilation of parallel corpora thanks
to their availability in large numbers and across many languages. However, the quality of the
resulting sentence alignments is often lower than for other parallel corpora. This paper presents a
new major release of the OpenSubtitles collection of parallel corpora, which is extracted from a
total of 3.7 million subtitles spread over 60 languages. In addition to a substantial increase in
the corpus size (about 30 % compared to the previous version), this new release associates explicit
quality scores to each sentence alignment. These scores are determined by a statistical regression
model based on simple language-independent features and estimated on a small sample of aligned
sentence pairs. Evaluation results show that the model is able to predict lexical translation
probabilities with a root mean square error of 0.07 (coefficient of determination R² = 0.47). Based
on the scores produced by this regression model, the parallel corpora can be filtered to prune out
alignments with a score below a given threshold.
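The rescoring-and-filtering idea can be sketched as follows (feature values and the regression setup are invented for illustration):

```python
# Fit a regression model mapping simple language-independent alignment
# features (e.g. length ratio, shared numbers, punctuation overlap) to a
# quality score, then prune alignments scoring below a threshold.
from sklearn.linear_model import LinearRegression

X_train = [[0.95, 1, 0.8], [0.40, 0, 0.2], [1.05, 1, 0.9]]  # toy features
y_train = [0.82, 0.15, 0.88]       # lexical translation probabilities
model = LinearRegression().fit(X_train, y_train)

THRESHOLD = 0.5
corpus = {("fr_sent_1", "en_sent_1"): [0.97, 1, 0.85],
          ("fr_sent_2", "en_sent_2"): [0.35, 0, 0.10]}
kept = [pair for pair, feats in corpus.items()
        if model.predict([feats])[0] >= THRESHOLD]
print(kept)   # only the high-quality alignment survives
```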
Lison, P. & Doğruöz, A.S. (2018) Detecting Machine-translated Documents in Large Parallel Corpora. In Proceedings of the 11th Workshop on Building and Using Comparable Corpora (BUCC 2018).
@InProceedings{bucc2018, author= {Pierre Lison and A. Seza Do\u{g}ru{\"o}z}, title = {Detecting Machine-translated Documents in Large Parallel Corpora}, booktitle = {Proceedings of the 11th Workshop on Building and Using Comparable Corpora (BUCC 2018)}, address = {Miyazaki, Japan}, year = {2018} }
Parallel corpora extracted from online repositories of movie and TV subtitles are employed in a wide
range of NLP applications, from language modelling to machine translation and dialogue systems.
However, the subtitles uploaded in such repositories exhibit varying levels of quality. A
particularly difficult problem stems from the fact that a substantial number of these subtitles are
not written by human subtitlers but are simply generated through the use of online translation
engines. This paper investigates whether these machine-generated subtitles can be detected
automatically using a combination of linguistic and extra-linguistic features. We show that a
feedforward neural network trained on a small dataset of subtitles can detect machine-generated
subtitles with an F1-score of 0.64. Furthermore, applying this detection model to an unlabelled
sample of subtitles allows us to provide a statistical estimate for the proportion of subtitles that
are machine-translated (or are at least of very low quality) in the full corpus.
| |||
Lison, P. & Mavroeidis, V. (2017) Neural Reputation Models learned from Passive DNS Data. In Proceeding of the First International Workshop on Big Data Analytics for Cyber Crime Investigation and Prevention, IEEE Big Data, IEEE. | [BibTex] | [Abstract] | |
@InProceedings{bigdata2017, author= {Pierre Lison and Vasileios Mavroeidis}, title = {Neural Reputation Models learned from Passive {DNS} Data}, booktitle = {Proceeding of the First International Workshop on Big Data Analytics for Cyber Crime Investigation and Prevention, IEEE Big Data}, publisher = {IEEE}, address = {Boston, USA}, year = {2017} } | |||
Blacklists and whitelists are often employed to filter outgoing and incoming traffic on computer
networks. One central function of these lists is to mitigate the security risks posed by malware
threats by associating a reputation (for instance benign or malicious) with end-point hosts.
The creation and maintenance of these lists is a complex and time-consuming process for security
experts. As a consequence, blacklists and whitelists are prone to various errors, inconsistencies
and omissions, as only a tiny fraction of end-point hosts are effectively covered by the reputation
lists. In this paper, we present a machine learning model that is able to automatically detect
whether domain names and IP addresses are benign, malicious or sinkholes. The model relies on a deep
neural architecture and is trained on a large passive DNS database. Evaluation results demonstrate
the effectiveness of the approach, as the model is able to detect malicious DNS records with an F1
score of 0.96. In other words, the model is able to detect 95 % of the malicious hosts with a false
positive rate of 1:1000.
| |||
Lison, P. & Mavroeidis, V. (2017) Automatic Detection of Malware-Generated Domains with Recurrent Neural Models. In Norwegian Information Security Conference (NISK 2017), pages 135-146. | [BibTex] | [Abstract] | |
@InProceedings{nisk2017, author= {Pierre Lison and Vasileios Mavroeidis}, title = {Automatic Detection of Malware-Generated Domains with Recurrent Neural Models}, booktitle = {Norwegian Information Security Conference (NISK 2017)}, address = {Oslo, Norway}, year = {2017}, pages = {135-146}} | |||
Modern malware families often rely on domain-generation algorithms (DGAs) to determine rendezvous
points to their command-and-control server. Traditional defence strategies (such as blacklisting
domains or IP addresses) are inadequate against such techniques due to the large and continuously
changing list of domains produced by these algorithms. This paper demonstrates that a machine
learning approach based on recurrent neural networks is able to detect domain names generated by
DGAs with high precision. The neural models are estimated on a large training set of domains
generated by various malware families. Experimental results show that this data-driven approach can detect
malware-generated domain names with an F1 score of 0.971. To put it differently, the model can
automatically detect 93 % of malware-generated domain names for a false positive rate of 1:100.
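The following is a minimal sketch of the general technique, a character-level recurrent classifier over domain names, written in PyTorch. The architecture and hyper-parameters are illustrative assumptions, not the configuration evaluated in the paper.
```python
# Sketch of a character-level recurrent classifier for domain names
# (PyTorch; layer sizes and encoding are illustrative assumptions).
import torch
import torch.nn as nn

class DomainClassifier(nn.Module):
    def __init__(self, vocab_size=128, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, char_ids):             # char_ids: (batch, seq_len)
        x = self.embed(char_ids)
        _, (h, _) = self.lstm(x)              # h: final hidden state
        return torch.sigmoid(self.out(h[-1]))  # P(domain is DGA-generated)

def encode(domain, max_len=40):
    """Map ASCII characters to ids and zero-pad to a fixed length."""
    ids = [min(ord(c), 127) for c in domain[:max_len]]
    return torch.tensor([ids + [0] * (max_len - len(ids))])

model = DomainClassifier()
print(model(encode("xjwqkzpd.biz")).item())   # untrained output, around 0.5
```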
| |||
Lison, P. & Bibauw, S. (2017) Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models. In Proceedings of the 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2017), pages 384-394, ACL. | [BibTex] | [Abstract] | |
@InProceedings{sigdial2017, author= {Pierre Lison and Serge Bibauw}, title = {Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models}, booktitle = {Proceedings of the 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2017)}, publisher = {ACL}, address = {Saarbr{\"u}cken, Germany}, year = {2017}, pages = {384-394} } | |||
Neural conversational models require substantial amounts of dialogue data for their parameter
estimation and are therefore usually learned on large corpora such as chat forums or movie
subtitles. These corpora are, however, often challenging to work with, notably due to their frequent
lack of turn segmentation and the presence of multiple references external to the dialogue itself.
This paper shows that these challenges can be mitigated by adding a weighting model into the
architecture. The weighting model, which is itself estimated from dialogue data, associates each
training example with a numerical weight that reflects its intrinsic quality for dialogue modelling.
At training time, these sample weights are included in the empirical loss to be minimised.
Evaluation results on retrieval-based models trained on movie and TV subtitles demonstrate that the
inclusion of such a weighting model improves the model performance on unsupervised metrics.
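The central mechanism, weighting each example's contribution to the empirical loss, can be sketched in a few lines of PyTorch. The sketch below assumes a plain classification loss for brevity, whereas the paper applies the idea to retrieval-based neural conversation models.
```python
# Sketch of instance weighting in the empirical loss (illustrative only).
import torch
import torch.nn.functional as F

def weighted_loss(logits, targets, weights):
    """Per-example cross-entropy, scaled by each sample's quality weight."""
    per_example = F.cross_entropy(logits, targets, reduction="none")
    return (weights * per_example).mean()

logits = torch.randn(4, 10, requires_grad=True)   # batch of 4, 10 classes
targets = torch.randint(0, 10, (4,))
weights = torch.tensor([1.0, 0.2, 0.8, 0.5])      # from a weighting model
loss = weighted_loss(logits, targets, weights)
loss.backward()   # low-quality examples contribute smaller gradients
```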
| |||
Lison, P. & Kennington, C. (2017) Incremental Processing for Neural Conversational Models. In Proceedings of the 21st Workshop on the Semantics and Pragmatics of Dialogue (SemDial 2017), pages 162-163, SemDial. | [BibTex] | [Abstract] | |
@InProceedings{semdial2017, author= {Pierre Lison and Casey Kennington}, title = {Incremental Processing for Neural Conversational Models}, booktitle = {Proceedings of the 21st Workshop on the Semantics and Pragmatics of Dialogue (SemDial 2017)}, address = {Saarbr{\"u}cken, Germany}, publisher = {SemDial}, year = {2017}, pages = {162-163} } | |||
We present a simple approach to adapt neural conversation models to incremental processing. The
approach is validated with a proof-of-concept experiment in a visual reference resolution task.
| |||
Lison, P. & Kutuzov, A. (2017) Redefining Context Windows for Word Embedding Models: An Experimental Study. In Proceedings of the 21st Nordic Conference on Computational Linguistics (Nodalida 2017), pages 284-288, Linköping University Electronic Press. | [BibTex] | [Abstract] | |
@InProceedings{nodalida2017, author= {Pierre Lison and Andrei Kutuzov}, title = {Redefining Context Windows for Word Embedding Models: An Experimental Study}, booktitle = {Proceedings of the 21st Nordic Conference on Computational Linguistics (Nodalida 2017)}, address = {G{\"o}teborg, Sweden}, publisher = {Link{\"o}ping University Electronic Press}, year = {2017}, pages = {284-288} } | |||
Distributional semantic models learn vector representations of words through the contexts they occur
in. Although the choice of context (which often takes the form of a sliding window) has a direct
influence on the resulting embeddings, the exact role of this model component is still not fully
understood. This paper presents a systematic analysis of context windows based on a set of four
distinct hyper-parameters. We train continuous Skip-Gram models on two English-language corpora for
various combinations of these hyper-parameters, and evaluate them on both lexical similarity and
analogy tasks. Notable experimental results are the positive impact of cross-sentential contexts and
the surprisingly good performance of right-context windows.
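For readers wishing to replicate a scaled-down version of such a study, the gensim sketch below trains Skip-Gram models under different window sizes. Note that gensim only exposes symmetric windows, so the paper's right-context and weighting settings would require a modified implementation; the corpus and settings here are purely illustrative.
```python
# Toy sketch with gensim: Skip-Gram models under different window sizes.
# gensim only exposes symmetric windows, so this covers just one of the
# four hyper-parameters studied in the paper.
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "lay", "on", "the", "rug"]]

for window in (2, 5, 10):
    model = Word2Vec(sentences, sg=1, window=window,
                     vector_size=50, min_count=1, epochs=10, seed=1)
    print(window, model.wv.most_similar("cat", topn=1))
```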
| |||
Lison, P. & Meena, R. (2016) Automatic Turn Segmentation of Movie & TV Subtitles. In Proceedings of the 2016 Spoken Language Technology Workshop, pages 245-252, IEEE. | [BibTex] | [Abstract] | |
@InProceedings{slt2016, author= {Pierre Lison and Raveesh Meena}, title = {Automatic Turn Segmentation of Movie \& TV Subtitles}, booktitle = {Proceedings of the 2016 Spoken Language Technology Workshop}, address = {San Diego, CA, USA}, publisher = {IEEE}, year = {2016}, pages = {245-252} } | |||
Movie and TV subtitles contain large amounts of conversational material, but lack an explicit turn
structure. This paper presents a data-driven approach to the segmentation of subtitles into dialogue
turns. Training data is first extracted by aligning subtitles with transcripts in order to obtain
speaker labels. This data is then used to build a classifier whose task is to determine whether two
consecutive sentences are part of the same dialogue turn. The approach relies on linguistic, visual
and timing features extracted from the subtitles themselves and does not require access to the
audiovisual material -- although speaker diarization can be exploited when audio data is available.
The approach also exploits alignments with related subtitles in other languages to further improve
the classification performance. The classifier achieves an accuracy of 78 % on a held-out test set.
A follow-up annotation experiment demonstrates that this task is also difficult for human
annotators.
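The classification step can be pictured as follows: a binary classifier deciding, from simple cues over two consecutive subtitle sentences, whether they belong to the same turn. The features and training data below are invented stand-ins for the paper's much richer linguistic, visual and timing features.
```python
# Illustrative sketch of turn-boundary classification between two
# consecutive subtitle sentences (features are simplified assumptions).
from sklearn.linear_model import LogisticRegression

def pair_features(sent1, sent2, gap_seconds):
    """A few cues hinting at a speaker change."""
    return [gap_seconds,                                      # timing gap
            1.0 if sent2.lstrip().startswith("-") else 0.0,   # dialogue dash
            1.0 if sent1.rstrip().endswith("?") else 0.0]     # question turn

# (sent1, sent2, gap, same_turn) tuples, e.g. derived from aligned transcripts
data = [("How are you?", "- Fine, thanks.", 1.2, 0),
        ("I went to the", "store yesterday.", 0.1, 1),
        ("Stop right there!", "- Why should I?", 2.0, 0),
        ("He said he would", "come back later.", 0.2, 1)]
X = [pair_features(s1, s2, g) for s1, s2, g, _ in data]
y = [label for _, _, _, label in data]
clf = LogisticRegression().fit(X, y)
print(clf.predict([pair_features("Are you sure?", "- Absolutely.", 1.5)]))
```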
| |||
Stoyanchev, S., Lison, P. & Bangalore, S. (2016) Rapid Prototyping of Form-driven Dialogue Systems Using an Open-source Framework. In Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 216-219, Association for Computational Linguistics. | [BibTex] | [Abstract] | |
@InProceedings{stoyanchev-lison-bangalore:2016:SIGDIAL, author = {Stoyanchev, Svetlana and Lison, Pierre and Bangalore, Srinivas}, title = {Rapid Prototyping of Form-driven Dialogue Systems Using an Open-source Framework}, booktitle = {Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue}, month = {September}, year = {2016}, address = {Los Angeles}, publisher = {Association for Computational Linguistics}, pages = {216--219}, url = {http://www.aclweb.org/anthology/W16-3626} } | |||
Most human-machine communication for information access through speech, text and graphical
interfaces is mediated by forms, i.e. lists of named fields. However, deploying form-filling
dialogue systems still remains a challenging task due to the effort and skill required to author
such systems. We describe an extension to the OpenDial framework that enables the rapid creation of
functional dialogue systems by non-experts. The dialogue designer specifies the slots and their
types as input and the tool generates a domain specification that drives a slot-filling dialogue
system. The presented approach provides several benefits compared to traditional techniques based on
flowcharts, such as the use of probabilistic reasoning and flexible grounding strategies.
| |||
Lison, P. & Kennington, C. (2016) OpenDial: A Toolkit for Developing Spoken Dialogue Systems with Probabilistic Rules. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Demonstrations), pages 67-72, Association for Computational Linguistics. | [BibTex] | [Abstract] | |
@InProceedings{lison-kennington:2016, author = {Pierre Lison and Casey Kennington}, title = {{OpenDial}: A Toolkit for Developing Spoken Dialogue Systems with Probabilistic Rules}, booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Demonstrations)}, year = {2016}, address = {Berlin, Germany}, publisher = {Association for Computational Linguistics}, pages = {67--72} } | |||
We present a new release of OpenDial, an open-source toolkit for building and evaluating spoken
dialogue systems. The toolkit relies on an information-state architecture where the dialogue state
is represented as a Bayesian network and acts as a shared memory for all system modules. The domain
models are specified via probabilistic rules encoded in XML. OpenDial has been deployed in several
application domains such as human-robot interaction, intelligent tutoring systems and multi-modal
in-car driver assistants.
| |||
Lison, P. & Tiedemann, J. (2016) OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016). | [BibTex] | [Abstract] | |
@inproceedings{opensubtitles2016, author={Pierre Lison and J{\"o}rg Tiedemann}, title={OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles}, booktitle={Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016)}, year= {2016}, location = {Portoro\v{z}, Slovenia} } | |||
We present a new major release of the OpenSubtitles collection of parallel corpora. The release is
compiled from a large database of movie and TV subtitles and includes a total of 1689 bitexts
spanning 2.6 billion sentences across 60 languages. The release also incorporates a number of
enhancements in the preprocessing and alignment of the subtitles, such as the automatic correction
of OCR errors and the use of meta-data to estimate the quality of each subtitle and score subtitle
pairs.
| |||
Dragone, P. & Lison, P. (2015) An Active Learning Approach to the Classification of Non-Sentential Utterances. In Proceedings of the Second Italian Conference on Computational Linguistics, pages 115-119. | [BibTex] | [Abstract] | |
@inproceedings{nsus_clic2015, author={Paolo Dragone and Pierre Lison}, title = {An {A}ctive {L}earning Approach to the Classification of {N}on-{S}entential {U}tterances}, year={2015}, booktitle={Proceedings of the Second Italian Conference on Computational Linguistics}, pages={115-119}, location={Trento, Italy} } | |||
This paper addresses the problem of classification of non-sentential utterances (NSUs). NSUs are
utterances that do not have a complete sentential form but convey a full clausal meaning given the
dialogue context. We extend the approach of Fernandez et al. (2007), which provides a taxonomy of
NSUs and a small annotated corpus extracted from dialogue transcripts. This paper demonstrates how
the combination of new linguistic features and active learning techniques can mitigate the scarcity
of labelled data. The results show a significant improvement in the classification accuracy over the
state-of-the-art.
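The active learning component can be sketched as pool-based uncertainty sampling: train a classifier on the labelled NSUs, then request a label for the unlabelled instance the model is least confident about. The classifier, features and selection criterion below are assumptions for illustration.
```python
# Sketch of pool-based uncertainty sampling (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

def most_uncertain(clf, X_pool):
    """Index of the pool instance with the least confident prediction."""
    probs = clf.predict_proba(X_pool)
    return int(np.argmin(probs.max(axis=1)))   # low max-probability = uncertain

rng = np.random.default_rng(0)
X_lab, y_lab = rng.normal(size=(10, 5)), rng.integers(0, 3, 10)  # labelled seed
X_pool = rng.normal(size=(50, 5))                                # unlabelled NSUs

for _ in range(3):                       # a few active-learning rounds
    clf = LogisticRegression(max_iter=200).fit(X_lab, y_lab)
    i = most_uncertain(clf, X_pool)
    new_label = rng.integers(0, 3)       # stand-in for a human annotation
    X_lab = np.vstack([X_lab, X_pool[i]])
    y_lab = np.append(y_lab, new_label)
    X_pool = np.delete(X_pool, i, axis=0)
```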
| |||
Lison, P. & Kennington, C. (2015) Developing Spoken Dialogue Systems with the OpenDial toolkit. In Proceedings of the 19th Workshop on the Semantics and Pragmatics of Dialogue. | [BibTex] | [Abstract] | |
@inproceedings{semdial2015_opendial, author={Pierre Lison and Casey Kennington}, title={Developing Spoken Dialogue Systems with the OpenDial toolkit}, booktitle={Proceedings of the 19th Workshop on the Semantics and Pragmatics of Dialogue}, year={2015}, location={G\"oteborg, Sweden} } | |||
We present OpenDial, an open-source toolkit for building and evaluating dialogue systems. The
toolkit is centered on a dialogue state expressed as a Bayesian network and acting as a shared
memory for the system modules. The domain models are specified via probabilistic rules encoded in a
simple XML format. The toolkit has been deployed in several application domains such as human-robot
interaction and in-car driver assistants.
| |||
Dragone, P. & Lison, P. (2015) Non-sentential utterances in dialogue: experiments in classification and interpretation. In Proceedings of the 19th Workshop on the Semantics and Pragmatics of Dialogue. | [BibTex] | [Abstract] | |
@inproceedings{semdial2015_nsus, author={Paolo Dragone and Pierre Lison}, title={Non-sentential utterances in dialogue: experiments in classification and interpretation}, booktitle={Proceedings of the 19th Workshop on the Semantics and Pragmatics of Dialogue}, year={2015}, location={G\"oteborg, Sweden} } | |||
We present two ongoing experiments related to the classification and interpretation of
non-sentential utterances (NSUs). Extending the work of Fernandez et al. (2007), we first show that
the classification performance of NSUs can be improved through the combination of new linguistic
features and active learning techniques. We also describe a new, hybrid approach to the semantic
interpretation of NSUs based on probabilistic rules.
| |||
Kosek, M. & Lison, P. (2014) An Intelligent Tutoring System for Learning Chinese with a Cognitive Model of the Learner. In Proceedings of the EUROCALL 2014 Conference. | [BibTex] | [Abstract] | |
@inproceedings{michal2014, author= {Micha\l{} Kosek and Pierre Lison}, title = {An Intelligent Tutoring System for Learning Chinese with a Cognitive Model of the Learner}, booktitle = {Proceedings of the EUROCALL 2014 Conference}, year = {2014}} | |||
We present an Intelligent Tutoring System that lets students of Chinese learn words and grammatical
constructions. It relies on a Bayesian, linguistically motivated cognitive model that represents the
learner's knowledge. This model is dynamically updated given observations about the learner's
behaviour in the exercises, and employed at runtime to select the exercises that are expected to
maximise the learning outcome. Compared with a baseline that randomly chooses exercises at the user's
declared level, the system shows positive effects on users' assessment of how much they have learnt,
which suggests that it leads to enhanced learning.
| |||
Lison, P. (2013) Model-Based Bayesian Reinforcement Learning for Dialogue Management. In Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). | [BibTex] | [Abstract] | |
@inproceedings{mbbrldm-plison-is2013, author = {Pierre Lison}, title = {Model-Based Bayesian Reinforcement Learning for Dialogue Management}, booktitle = {Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013)}, year = {2013}} | |||
Reinforcement learning methods are increasingly used to optimise dialogue policies from experience.
Most current techniques are model-free: they directly estimate the utility of various actions,
without an explicit model of the interaction dynamics. In this paper, we investigate an alternative
strategy grounded in model-based Bayesian reinforcement learning. Bayesian inference is used to
maintain a posterior distribution over the model parameters, reflecting the model uncertainty. This
parameter distribution is gradually refined as more data is collected and simultaneously used to
plan the agent's actions. Within this learning framework, we carried out experiments with two
alternative formalisations of the transition model, one encoded with standard multinomial
distributions, and one structured with probabilistic rules. We demonstrate the potential of our
approach with empirical results on a user simulator constructed from Wizard-of-Oz data in a
human-robot interaction scenario. The results illustrate in particular the benefits of capturing
prior domain knowledge with high-level rules.
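The multinomial formalisation of the transition model admits a compact sketch: Dirichlet counts maintain the posterior over transition probabilities, which can then be sampled for planning. The state and action spaces below are toy assumptions; the paper's full planning algorithm is not reproduced here.
```python
# Toy sketch: Dirichlet posterior over a multinomial transition model,
# updated from observed transitions and sampled for planning
# (state/action spaces are invented for illustration).
import numpy as np

n_states, n_actions = 3, 2
# Dirichlet counts for P(s' | s, a), starting from a uniform prior
alpha = np.ones((n_states, n_actions, n_states))

def observe(s, a, s_next):
    alpha[s, a, s_next] += 1               # Bayesian posterior update

def sample_model():
    """Draw one plausible transition model from the current posterior."""
    return np.array([[np.random.dirichlet(alpha[s, a])
                      for a in range(n_actions)] for s in range(n_states)])

observe(0, 1, 2)
observe(0, 1, 2)
print(sample_model()[0, 1])                # sampled P(s' | s=0, a=1)
```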
| |||
Lison, P. (2012) Towards Online Planning for Dialogue Management with Rich Domain Knowledge. In Proceedings of the IWSDS'2012 Workshop on Spoken Dialog Systems, Springer. | [BibTex] | [Abstract] | |
@inproceedings{onlineplanning-iwsds2012, author = {Pierre Lison}, title = {Towards Online Planning for Dialogue Management with Rich Domain Knowledge}, booktitle = {Proceedings of the IWSDS'2012 Workshop on Spoken Dialog Systems}, publisher = {Springer}, location = {Paris, France}, month = {November}, year = {2012}} | |||
Most approaches to dialogue management have so far concentrated on offline optimisation techniques,
where a dialogue policy is precomputed for all possible situations and then plugged into the
dialogue system. This development strategy has however some limitations in terms of domain
scalability and adaptivity, since these policies are essentially static and cannot readily
accommodate runtime changes in the environment or task dynamics. In this paper, we follow an
alternative approach based on online planning. To ensure that the planning algorithm remains
tractable over longer horizons, the presented method relies on probabilistic models expressed via
probabilistic rules that capture the internal structure of the domain using high-level
representations. We describe in this paper the generic planning algorithm, ongoing implementation
efforts and directions for future work.
| |||
Lison, P. (2012) Towards Dialogue Management in Relational Domains. In SLTC Workshop on Action, Perception and Language (APL). | [BibTex] | [Abstract] | |
@inproceedings{relational-apl2012, author = {Pierre Lison}, title = {Towards Dialogue Management in Relational Domains}, booktitle = {SLTC Workshop on Action, Perception and Language (APL)}, location = {Lund, Sweden}, month = {October}, year = {2012}} | |||
Traditional approaches to dialogue management rely on a fixed, predefined set of state variables.
For many application domains, the dialogue state is however best described in terms of a varying
number of entities and the relations holding between them. These entities might correspond to
objects, places or persons in the context of the interaction, or represent a set of tasks to
perform. Such a formalization of the state space is well-suited for many domains, but presents some
challenges for the standard probabilistic models used in dialogue management, since these models are
propositional in nature and thus unable to directly operate on such a state representation. To address
this issue, we present an alternative approach based on the use of expressive probabilistic rules
that allow for limited forms of universal quantification. These rules take the form of structured
mappings between input and output variables, and function as high-level templates for the
probability and utility models integrated in the dialogue manager. We present in this abstract the
general formalisation of this approach, focusing on the use of universal quantifiers to capture the
relational structure of the domain.
| |||
Lison, P. (2012) Declarative Design of Spoken Dialogue Systems with Probabilistic Rules. In Proceedings of the 16th Workshop on the Semantics and Pragmatics of Dialogue (SemDial 2012). | [BibTex] | [Abstract] | |
@inproceedings{declarativedesign-semdial2012, author = {Pierre Lison}, title = {Declarative Design of Spoken Dialogue Systems with Probabilistic Rules}, booktitle = {Proceedings of the 16th Workshop on the Semantics and Pragmatics of Dialogue (SemDial 2012)}, year = {2012}, month={September}, location = {Paris, France}} | |||
Spoken dialogue systems are instantiated in complex architectures comprising multiple interconnected
components. These architectures often take the form of pipelines whose components are essentially
black-boxes developed and optimised separately, using ad-hoc specification formats for their inputs
and outputs, domain models and parameters. We present in this paper an alternative modelling
approach, in which the dialogue processing steps (from understanding to management and to
generation) are all declaratively specified using the same underlying formalism. The formalism is
based on probabilistic rules operating on a shared belief state. These rules are expressed as
structured mappings between state variables and provide a compact, probabilistic encoding for the
dialogue processing models. We argue that this declarative approach yields several advantages in
terms of transparency, domain-portability and adaptivity over traditional black-box architectures.
We also describe the implementation and validation of this approach in an integrated architecture
for human-robot interaction.
| |||
Lison, P. (2012) Probabilistic Dialogue Models with Prior Domain Knowledge. In Proceedings of the SIGDIAL 2012 Conference, pages 179-188, Association for Computational Linguistics. | [BibTex] | [Abstract] | |
@inproceedings{rulebasedmodels-sigdial2012, Author = {Pierre Lison}, Title = {Probabilistic Dialogue Models with Prior Domain Knowledge}, Booktitle = {Proceedings of the SIGDIAL 2012 Conference}, Month = {July}, Address = {Seoul, South Korea}, Publisher = {Association for Computational Linguistics}, Pages = {179--188}, Year = {2012}} | |||
Probabilistic models such as Bayesian Networks are now in widespread use in spoken dialogue systems,
but their scalability to complex interaction domains remains a challenge. One central limitation is
that the state space of such models grows exponentially with the problem size, which makes parameter
estimation increasingly difficult, especially for domains where only limited training data is
available. In this paper, we show how to capture the underlying structure of a dialogue domain in
terms of probabilistic rules operating on the dialogue state. The probabilistic rules are associated
with a small, compact set of parameters which can be directly estimated from data. We argue that the
introduction of this abstraction mechanism yields probabilistic models which are both easier to
learn and generalise better than their unstructured counterparts. We empirically demonstrate the
benefits of such an approach to learn a dialogue policy for a human-robot interaction domain based
on a Wizard-of-Oz data set.
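As a loose illustration of the rule abstraction (OpenDial expresses such rules declaratively in XML, not in Python), a probabilistic rule can be pictured as a mapping from a logical condition on state variables to a distribution over effects. The state variables, actions and probabilities below are made up for the example.
```python
# Loose Python rendering of a probabilistic rule (illustrative only;
# state variables, actions and probabilities are invented).
import random

def rule(state):
    """If the user asks to move while an obstacle is perceived,
    mostly ask for confirmation; otherwise just move."""
    if state.get("u_u") == "move forward" and state.get("obstacle"):
        return {"AskConfirm": 0.8, "Move": 0.2}
    return {"Move": 1.0}

def sample_action(dist):
    actions, probs = zip(*dist.items())
    return random.choices(actions, weights=probs)[0]

print(sample_action(rule({"u_u": "move forward", "obstacle": True})))
```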
| |||
Lison, P. (2011) Multi-Policy Dialogue Management. In Proceedings of the SIGDIAL 2011 Conference, pages 294-300, Association for Computational Linguistics. | [BibTex] | [Abstract] | |
@InProceedings{lison:2011:SIGDIAL2011, author = {Lison, Pierre}, title = {Multi-Policy Dialogue Management}, booktitle = {Proceedings of the SIGDIAL 2011 Conference}, month = {June}, year = {2011}, address = {Portland, Oregon}, publisher = {Association for Computational Linguistics}, pages = {294--300}} | |||
We present a new approach to dialogue management based on the use of multiple, interconnected
policies. Instead of capturing the complexity of the interaction in a single large policy, the
dialogue manager operates with a collection of small local policies combined concurrently and
hierarchically. The meta-control of these policies relies on an activation vector updated before and
after each turn.
| |||
Lison, P. & Kruijff, G.J. (2010) Policy activation for open-ended dialogue management. In Proceedings of the AAAI Fall Symposium on Dialog with Robots. | [BibTex] | [Abstract] | |
@inproceedings{policyactivation-aaai2010, Author = {Pierre Lison and Geert-Jan Kruijff}, Booktitle = {Proceedings of the AAAI Fall Symposium on Dialog with Robots}, Title = {Policy activation for open-ended dialogue management}, Year = {2010}} | |||
An important difficulty in developing spoken dialogue systems for robots is the open-ended nature of
most interactions. Robotic agents must typically operate in complex, continuously changing
environments which are difficult to model and do not provide any clear, predefined goal. Directly
capturing this complexity in a single, large dialogue policy is thus inadequate. This paper presents
a new approach which tackles the complexity of open-ended interactions by breaking it into a set of
small, independent policies, which can be activated and deactivated at runtime by a dedicated
mechanism. The approach is currently being implemented in a spoken dialogue system for autonomous
robots.
| |||
Lison, P. (2010) Towards Relational POMDPs for Adaptive Dialogue Management. In Proceedings of the ACL 2010 Student Research Workshop, pages 7-12, Association for Computational Linguistics. | [BibTex] | [Abstract] | |
@InProceedings{lison:2010:SRW, author = {Lison, Pierre}, title = {Towards Relational {P}{O}{M}{D}{P}s for Adaptive Dialogue Management}, booktitle = {Proceedings of the ACL 2010 Student Research Workshop}, month = {July}, year = {2010}, address = {Uppsala, Sweden}, publisher = {Association for Computational Linguistics}, pages = {7--12}} | |||
Open-ended spoken interactions are typically characterised by both structural complexity and high
levels of uncertainty, making dialogue management in such settings a particularly challenging
problem. Traditional approaches have focused on providing theoretical accounts for either the
uncertainty or the complexity of spoken dialogue, but rarely considered the two issues in tandem.
This paper describes ongoing work on a new approach to dialogue management which attempts to fill
this gap. We represent the interaction as a Partially Observable Markov Decision Process (POMDP)
over a rich state space incorporating dialogue, user, and environment models. The tractability
of the resulting POMDP can be preserved using a mechanism for dynamically constraining the action
space based on prior knowledge over locally relevant dialogue structures. These constraints are
encoded in a small set of general rules expressed as a Markov Logic network. The first-order
expressivity of Markov Logic enables us to leverage the rich relational structure of the problem and
efficiently abstract over large regions of the state and action spaces.
| |||
Lison, P., Ehrler, C. & Kruijff, G.J. (2010) Belief Modelling for Situation Awareness in Human-Robot Interaction. In Proceedings of the 19th International Symposium on Robot and Human Interactive Communication (RO-MAN 2010). | [BibTex] | [Abstract] | |
@inproceedings{roman2010-beliefs, Author = {Pierre Lison and Carsten Ehrler and Geert-Jan Kruijff}, Booktitle = {Proceedings of the 19th International Symposium on Robot and Human Interactive Communication (RO-MAN 2010)}, Location = {Viareggio, Italy}, Title = {Belief Modelling for Situation Awareness in Human-Robot Interaction}, Year = {2010}} | |||
To interact naturally with humans, robots need to be aware of their own surroundings. This
awareness is usually encoded in some implicit or explicit representation of the situated context. In
this paper, we present a new framework for constructing rich belief models of the robot's
environment. Key to our approach is the use of Markov Logic as a unified representation formalism.
Markov Logic is a combination of first-order logic and probabilistic graphical models. Its
expressive power allows us to capture both the rich relational structure of the environment and the
uncertainty arising from the noise and incompleteness of low-level sensory data. The constructed
belief models evolve dynamically over time and incorporate various contextual information such as
spatio-temporal framing, multi-agent epistemic status, and saliency measures. Beliefs can also be
referenced and extended ``top-down'' via linguistic communication. The approach is being integrated
into a cognitive architecture for mobile robots interacting with humans using spoken dialogue.
| |||
Kruijff, G.J., Janiček, M. & Lison, P. (2010) Continual Processing of Situated Dialogue in Human-Robot Collaborative Activities. In Proceedings of the 19th International Symposium on Robot and Human Interactive Communication (RO-MAN 2010). | [BibTex] | [Abstract] | |
@inproceedings{roman2010-cca, Author = {Geert-Jan Kruijff and Miroslav Jani\v{c}ek and Pierre Lison}, Booktitle = {Proceedings of the 19th International Symposium on Robot and Human Interactive Communication (RO-MAN 2010)}, Location = {Viareggio, Italy}, Title = {Continual Processing of Situated Dialogue in Human-Robot Collaborative Activities}, Year = {2010}} | |||
The paper presents an implemented approach to processing situated dialogue between a human and a
robot. The focus is on task-oriented dialogue, set in the larger context of human-robot
collaborative activity. The approach models understanding and production of dialogue to include
intension (what is being talked about), intention (the goal of why something is being said), and
attention (what is being focused on). These dimensions are directly construed in terms of
assumptions and assertions on situated multi-agent belief models. The approach is continual in that
it allows for interpretations to be dynamically retracted, revised, or deferred. This makes it
possible to deal with the inherent asymmetry in how robots and humans tend to understand dialogue,
and the world it is set in. The approach has been fully implemented, and integrated into a cognitive
robot. The paper discusses the implementation, and illustrates it in a collaborative learning
setting.
| |||
Skočaj, D., Kristan, M., Leonardis, A., Vrečko, A., Janiček, M., Kruijff, G.J., Lison, P. & Zillich, M. (2010) A system approach to interactive learning of visual concepts. In Tenth International Conference on Epigenetic Robotics. | [BibTex] | [Abstract] | |
@inproceedings{george-epirob2010, Author = {Danijel Sko\v{c}aj and Matej Kristan and Ale\v{s} Leonardis and Alen Vre\v{c}ko and Miroslav Jani\v{c}ek and Geert-Jan Kruijff and Pierre Lison and Michael Zillich}, Booktitle = {Tenth International Conference on Epigenetic Robotics}, Title = {A system approach to interactive learning of visual concepts}, Year = {2010}} | |||
In this work we present a system and underlying representations and mechanisms for continuous
learning of visual concepts in dialogue with a human tutor.
| |||
Skočaj, D., Kristan, M., Leonardis, A., Vrečko, A., Janiček, M., Kruijff, G.J., Lison, P. & Zillich, M. (2010) A basic cognitive system for interactive learning of simple visual concepts. In RSS Workshop on Learning for Human-Robot Interaction Modeling. | [BibTex] | [Abstract] | |
@inproceedings{george-rss2010, location = {Zaragoza, Spain}, Author = {Danijel Sko\v{c}aj and Matej Kristan and Ale\v{s} Leonardis and Alen Vre\v{c}ko and Miroslav Jani\v{c}ek and Geert-Jan Kruijff and Pierre Lison and Michael Zillich}, Booktitle = {RSS Workshop on Learning for Human-Robot Interaction Modeling}, Month = {June}, Title = {A basic cognitive system for interactive learning of simple visual concepts}, Year = {2010}} | |||
In this work we present a system and underlying representations and mechanisms for continuous
learning of visual concepts in dialogue with a human tutor.
| |||
Skočaj, D., Janiček, M., Kristan, M., Kruijff, G.J., Leonardis, A., Lison, P., Vrečko, A. & Zillich, M. (2010) A basic cognitive system for interactive continuous learning of visual concepts. In Proceedings of the workshop on Interactive Communication for Autonomous Intelligent Robots, ICRA 2010, pages 30-36. | [BibTex] | [Abstract] | |
@inproceedings{george-icair2010, location = {Anchorage, AK, USA}, Author = {Danijel Sko\v{c}aj and Miroslav Jani\v{c}ek and Matej Kristan and Geert-Jan Kruijff and Ale\v{s} Leonardis and Pierre Lison and Alen Vre\v{c}ko and Michael Zillich}, Booktitle = {Proceedings of the workshop on Interactive Communication for Autonomous Intelligent Robots, ICRA 2010}, Month = {May}, Pages = {30-36}, Title = {A basic cognitive system for interactive continuous learning of visual concepts}, Year = {2010}} | |||
Interactive continuous learning is an important characteristic of a cognitive agent that is supposed
to operate and evolve in an ever changing environment. In this paper we present representations and
mechanisms that are necessary for continuous learning of visual concepts in dialogue with a tutor.
We present an approach for modelling beliefs stemming from multiple modalities and we show how these
beliefs are created by processing visual and linguistic information and how they are used for
learning. We also present a system that exploits these representations and mechanisms, and
demonstrate these principles in the case of learning about object colours and basic shapes in
dialogue with the tutor.
| |||
Lison, P. (2009) Robust processing of situated spoken dialogue. In Von der Form zur Bedeutung: Texte automatisch verarbeiten / From Form to Meaning: Processing Texts Automatically, Narr Verlag. (Proceedings of the Biennial GSCL Conference 2009, Potsdam, Germany). | [BibTex] | [Abstract] | |
@inproceedings{plison.robustprocessing.gscl2009, Author = {Pierre Lison}, Booktitle = {Von der Form zur Bedeutung: Texte automatisch verarbeiten / From Form to Meaning: Processing Texts Automatically}, Editor = {Christian Chiarcos and Richard Eckart de Castilho and Manfred Stede}, Note = {Proceedings of the Biennial GSCL Conference 2009, Potsdam, Germany}, Publisher = {Narr Verlag}, Title = {Robust processing of situated spoken dialogue}, Year = {2009}} | |||
Spoken dialogue is notoriously hard to process with standard language processing technologies.
Dialogue systems must indeed meet two major challenges. First, natural spoken dialogue is replete
with disfluent, partial, elided or ungrammatical utterances. Second, speech recognition remains a
highly error-prone task, especially for complex, open-ended domains. We present an integrated
approach for addressing these two issues, based on a robust incremental parser. The parser takes
word lattices as input and is able to handle ill-formed and misrecognised utterances by selectively
relaxing its set of grammatical rules. The choice of the most relevant interpretation is then
realised via a discriminative model augmented with contextual information. The approach is fully
implemented in a dialogue system for autonomous robots. Evaluation results on a Wizard of Oz test
suite demonstrate very significant improvements in accuracy and robustness compared to the baseline.
| |||
Lison, P. & Kruijff, G.J. (2009) Robust processing of situated spoken dialogue. In KI 2009: Advances in Artificial Intelligence. Proceedings of the 32nd Annual German Conference on AI, Springer Verlag. | [BibTex] | [Abstract] | |
@inproceedings{plison-kruijff.robustprocessing.ki2009, location = {Paderborn, Germany}, Author = {Pierre Lison and Geert-Jan Kruijff}, Booktitle = {KI 2009: Advances in Artificial Intelligence. Proceedings of the 32nd Annual German Conference on AI}, Editor = {B\"{a}rbel Mertsching and Marcus Hund and Zaheer Aziz}, Publisher = {Springer Verlag}, Series = {Lecture Notes in Artificial Intelligence, Vol. 5803}, Title = {Robust processing of situated spoken dialogue}, Year = {2009}} | |||
Spoken dialogue is notoriously hard to process with standard language processing technologies.
Dialogue systems must indeed meet two major challenges. First, natural spoken dialogue is replete
with disfluent, partial, elided or ungrammatical utterances. Second, speech recognition remains a
highly error-prone task, especially for complex, open-ended domains. We present an integrated
approach for addressing these two issues, based on a robust incremental parser. The parser takes
word lattices as input and is able to handle ill-formed and misrecognised utterances by selectively
relaxing its set of grammatical rules. The choice of the most relevant interpretation is then
realised via a discriminative model augmented with contextual information. The approach is fully
implemented in a dialogue system for autonomous robots. Evaluation results on a Wizard of Oz test
suite demonstrate very significant improvements in accuracy and robustness compared to the baseline.
| |||
Lison, P. (2009) A Method to Improve the Efficiency of Deep Parsers with Incremental Chart Pruning. In Proceedings of the ESSLLI Workshop on Parsing with Categorial Grammars. | [BibTex] | [Abstract] | |
@inproceedings{plison.chartpruning.cgparsing2009, location = {Bordeaux, France}, Author = {Pierre Lison}, Booktitle = {Proceedings of the ESSLLI Workshop on Parsing with Categorial Grammars}, Title = {A Method to Improve the Efficiency of Deep Parsers with Incremental Chart Pruning}, Year = {2009}} | |||
The use of deep parsers in spoken dialogue systems is usually subject to strong performance
requirements. Real-time dialogue applications must be capable of responding quickly to any given
utterance, even in the presence of noisy, ambiguous or distorted input. The parser must therefore
ensure that the number of analyses remains bounded at every processing step. The paper presents a
practical approach to address this issue in the context of deep parsers designed for spoken
dialogue. The approach is based on a word lattice parser for Combinatory Categorial Grammar combined
with a discriminative model for parse selection. Each word lattice is parsed incrementally, word by
word, and a discriminative model is applied at each incremental step to prune the set of resulting
partial analyses. The model incorporates a wide range of linguistic and contextual features and can
be trained with a simple perceptron. The approach is fully implemented as part of a spoken dialogue
system for human-robot interaction. Evaluation results on a Wizard-of-Oz test suite demonstrate
significant improvements in parsing time.
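The parse-selection model, trained with a simple perceptron, can be sketched as learning a feature weighting that ranks candidate analyses. The three features below are invented for illustration and stand in for the paper's wide range of linguistic and contextual features.
```python
# Sketch of perceptron-trained parse selection over candidate analyses
# (feature functions are illustrative assumptions).
import numpy as np

def phi(parse):
    """Features of a candidate analysis: inside score, negated count of
    relaxed (non-standard) rules applied, and input coverage."""
    return np.array([parse["score"], -parse["relaxations"], parse["coverage"]])

def train(utterances, epochs=5):
    w = np.zeros(3)
    for _ in range(epochs):
        for candidates, gold in utterances:
            pred = max(range(len(candidates)), key=lambda i: w @ phi(candidates[i]))
            if pred != gold:                  # standard perceptron update
                w += phi(candidates[gold]) - phi(candidates[pred])
    return w

# Each utterance: list of candidate parses + index of the correct one
utterances = [([{"score": 0.2, "relaxations": 2, "coverage": 1.0},
                {"score": 0.5, "relaxations": 0, "coverage": 1.0}], 1),
              ([{"score": 0.4, "relaxations": 1, "coverage": 0.8},
                {"score": 0.3, "relaxations": 0, "coverage": 1.0}], 1)]
print(train(utterances))
```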
| |||
Lison, P. & Kruijff, G.J.M. (2009) An Integrated Approach to Robust Processing of Situated Spoken Dialogue. In Proceedings of SRSL 2009, the 2nd Workshop on Semantic Representation of Spoken Language, pages 58-65, Association for Computational Linguistics. | [BibTex] | [Abstract] | |
@InProceedings{lison-kruijff:2009:SRSL, author = {Lison, Pierre and Kruijff, Geert-Jan M.}, title = {An Integrated Approach to Robust Processing of Situated Spoken Dialogue}, booktitle = {Proceedings of SRSL 2009, the 2nd Workshop on Semantic Representation of Spoken Language}, month = {March}, year = {2009}, address = {Athens, Greece}, publisher = {Association for Computational Linguistics}, pages = {58--65}} | |||
Spoken dialogue is notoriously hard to process with standard NLP technologies. Natural spoken
dialogue is replete with disfluent, partial, elided or ungrammatical utterances, all of which are
difficult to accommodate in a dialogue system. Furthermore, speech recognition is known to be a
highly error-prone task, especially for complex, open-ended domains. The combination of these two
problems -- ill-formed and/or misrecognised speech inputs -- raises a major challenge to the
development of robust dialogue systems. We present an integrated approach for addressing these two
issues, based on an incremental parser for Combinatory Categorial Grammar. The parser takes word
lattices as input and is able to handle ill-formed and misrecognised utterances by selectively
relaxing its set of grammatical rules. The choice of the most relevant interpretation is then
realised via a discriminative model augmented with contextual information. The approach is fully
implemented in a dialogue system for autonomous robots. Evaluation results on a Wizard of Oz test
suite demonstrate very significant improvements in accuracy and robustness compared to the baseline.
| |||
Lison, P. & Kruijff, G.J. (2008) Salience-driven Contextual Priming of Speech Recognition for Human-Robot Interaction. In Proceedings of the 18th European Conference on Artificial Intelligence (ECAI 2008). | [BibTex] | [Abstract] | |
@inproceedings{Lison/Kruijff:2008, location = {Patras, Greece}, Author = {Pierre Lison and Geert-Jan Kruijff}, Booktitle = {Proceedings of the 18th European Conference on Artificial Intelligence (ECAI 2008)}, Keywords = {human-robot interaction, speech recognition, statistical language models, salience modeling, cognitive systems}, Title = {Salience-driven Contextual Priming of Speech Recognition for Human-Robot Interaction}, Year = {2008}} | |||
The paper presents an implemented model for priming speech recognition, using contextual information
about salient entities. The underlying hypothesis is that, in human-robot interaction, speech
recognition performance can be improved by exploiting knowledge about the immediate physical
situation and the dialogue history. To this end, visual salience (objects perceived in the physical
scene) and linguistic salience (objects, events already mentioned in the dialogue) are integrated
into a single cross-modal salience model. The model is dynamically updated as the environment
changes. It is used to establish expectations about which words are most likely to be heard in the
given context. The update is realised by continuously adapting the word-class probabilities
specified in a statistical language model. The paper discusses the motivations behind the approach,
and presents the implementation as part of a cognitive architecture for mobile robots. Evaluation
results on a test suite show a statistically significant improvement in word error rate (WER) for
salience-driven priming over a commercial baseline system.
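A toy sketch of the priming mechanism: probabilities in the language model are shifted towards words associated with salient entities and then renormalised. The uniform boost used here is an assumption for illustration; the paper adapts word-class probabilities within a full statistical language model.
```python
# Toy sketch of salience-driven priming of word probabilities in a language
# model (the interpolation scheme is an assumption, not the paper's model).
def prime(word_probs, salient_words, boost=0.5):
    """Boost words linked to salient entities, then renormalise."""
    primed = {w: p * (1 + boost) if w in salient_words else p
              for w, p in word_probs.items()}
    total = sum(primed.values())
    return {w: p / total for w, p in primed.items()}

unigram = {"ball": 0.10, "box": 0.10, "robot": 0.80}
# A red ball has just appeared in the visual scene:
print(prime(unigram, salient_words={"ball"}))   # 'ball' now more expected
```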
| |||
Lison, P. (2008) A salience-driven approach to speech recognition for human-robot interaction. In Proceedings of the 13th ESSLLI student session (ESSLLI 2008). | [BibTex] | [Abstract] | |
@inproceedings{ESSLLI2008, location = {Hamburg, Germany}, Author = {Pierre Lison}, Booktitle = {Proceedings of the 13th ESSLLI student session (ESSLLI 2008)}, Keywords = {human-robot interaction, speech recognition, statistical language models, salience modeling, cognitive systems}, Title = {A salience-driven approach to speech recognition for human-robot interaction}, Year = {2008}} | |||
We present an implemented model for speech recognition in natural environments which relies on
contextual information about salient entities to prime utterance recognition. The hypothesis
underlying our approach is that, in situated human-robot interactions, the speech recognition
performance can be significantly enhanced by exploiting knowledge about the immediate physical
environment and the dialogue history. To this end, visual salience (objects perceived in the
physical scene) and linguistic salience (previously referred expressions within the current
dialogue) are integrated into a single cross-modal salience model. The model is dynamically updated
as the environment evolves, and is used to establish expectations about uttered words which are most
likely to be heard given the context. The update is realised by continuously adapting the word-class
probabilities specified in the statistical language model. The present article discusses the
motivations behind our approach, describes our implementation as part of a distributed, cognitive
architecture for mobile robots, and reports the evaluation results on a test suite.
| |||
Kruijff, G.J., Lison, P., Benjamin, T., Jacobsson, H. & Hawes, N. (2007) Incremental, multi-level processing for comprehending situated dialogue in human-robot interaction. In Proceedings of the Symposium on Language and Robots (LangRo'2007). | [BibTex] | [Abstract] | |
@inproceedings{aveiro07, location = {Aveiro, Portugal}, Author = {Geert-Jan Kruijff and Pierre Lison and Trevor Benjamin and Henrik Jacobsson and Nick Hawes}, Booktitle = {Proceedings of the Symposium on Language and Robots (LangRo'2007)}, Title = {Incremental, multi-level processing for comprehending situated dialogue in human-robot interaction}, Year = {2007}} | |||
The paper presents work in progress on an implemented model of situated dialogue processing. The
underlying assumption is that to understand situated dialogue, communicated meaning needs to be
related to the situation(s) it refers to. The model couples incremental processing to a notion of
bidirectional connectivity, inspired by how humans process visually situated language. Analyzing an
utterance in a ``word by word'' fashion, a representation of possible utterance interpretations is
gradually built up. In a top-down fashion, the model tries to ground these interpretations in
situation awareness, through which they can prime what is focused on in a situation. In a bottom-up
fashion, the (im)possibility to ground certain interpretations primes how the analysis of the
utterance further unfolds. The paper discusses the implementation of the model in a distributed,
cognitive architecture for human-robot interaction, and presents an evaluation on a test suite. The
evaluation shows (and quantifies) the effects linguistic interpretation has on priming incremental
utterance processing, and discusses how such evaluation can be extended to include
situation-relative interpretation.
|
Books and book chapters
Lison, P. & Specia, L. (2022) Dis, c'est quoi l'intelligence artificielle? Renaissance du Livre. | [BibTex] | |
@Book{DCQ_IA, author = {Pierre Lison and Luc Specia}, title = {Dis, c'est quoi l'intelligence artificielle?}, month = {February}, year = {2022}, address = {Waterloo, Belgium}, publisher = {Renaissance du Livre}} | |||
Kruijff, G.J., Lison, P., Benjamin, T., Jacobsson, H., Zender, H. & Kruijff-Korbayová, I. (2010) Situated Dialogue Processing for Human-Robot Interaction. In Cognitive Systems, 8, Springer Verlag. | [BibTex] | |
@inbook{cosybook:dialogue, location = {Heidelberg, Germany}, Author = {Geert-Jan Kruijff and Pierre Lison and Trevor Benjamin and Henrik Jacobsson and Hendrik Zender and Ivana Kruijff-Korbayov\'{a}}, title = {Situated Dialogue Processing for Human-Robot Interaction}, Editor = {Christensen, Henrik Iskov and Sloman, Aaron and Kruijff, Geert-Jan M. and Wyatt, Jeremy L.}, Month = {July}, Publisher = {Springer Verlag}, Series = {Cognitive Systems Monographs}, booktitle = {Cognitive Systems}, Volume = {8}, Year = {2010}} | |||
Lison, P. (2010) A salience-driven approach to speech recognition for human-robot interaction. In Thomas Icard; Reinhard Muskens: Interfaces: Explorations in Logic, Language and Computation, pages 102-113, Springer Verlag. (extended reprint of the 2008 ESSLLI paper). | [BibTex] | [Abstract] | |
@incollection{ESSLLI2008-springerreprint, Author = {Pierre Lison}, Booktitle = {Thomas Icard; Reinhard Muskens: Interfaces: Explorations in Logic, Language and Computation}, Keywords = {speech recognition, human-robot interaction, spoken dialogue systems}, Note = {extended reprint of the 2008 ESSLLI paper}, Pages = {102-113}, Publisher = {Springer Verlag}, Title = {A salience-driven approach to speech recognition for human-robot interaction}, Year = {2010}} | |||
We present an implemented model for speech recognition in natural environments which relies on
contextual information about salient entities to prime utterance recognition. The hypothesis
underlying our approach is that, in situated human-robot interactions, the speech recognition
performance can be significantly enhanced by exploiting knowledge about the immediate physical
environment and the dialogue history. To this end, visual salience (objects perceived in the
physical scene) and linguistic salience (previously referred expressions within the current
dialogue) are integrated into a single cross-modal salience model. The model is dynamically updated
as the environment evolves, and is used to establish expectations about uttered words which are most
likely to be heard given the context. The update is realised by continuously adapting the word-class
probabilities specified in the statistical language model. The present article discusses the
motivations behind our approach, describes our implementation as part of a distributed, cognitive
architecture for mobile robots, and reports the evaluation results on a test suite.
|
PhD and master's theses
Lison, P. (2014) Structured Probabilistic Modelling for Dialogue Management. PhD thesis, University of Oslo. | [BibTex] | [Abstract] | |
@phdthesis{LisonThesis2014, Author = {Pierre Lison}, Month = {February}, School = {University of Oslo}, Title = {{S}tructured {P}robabilistic {M}odelling for {D}ialogue {M}anagement}, Year = {2014}} | |||
This thesis presents a new modelling framework for dialogue management based on the concept of probabilistic rules. Probabilistic rules are defined as if...then...else constructions associating logical conditions on input variables to probabilistic effects over output variables. These rules function as high-level templates for the generation of a directed graphical model. Their expressive power allows them to represent the probabilistic models employed in dialogue management in a compact and efficient manner. As a consequence, they can drastically reduce the amount of interaction data required for parameter estimation as well as enhance the system's ability to generalise over unseen situations. Furthermore, probabilistic rules can also be exploited to encode domain-specific constraints and assumptions into statistical models of dialogue, thereby enabling system designers to incorporate their expert knowledge of the problem structure in a concise and human-readable form. Due to their integration of logical and probabilistic reasoning, we argue that probabilistic rules are particularly well suited to devise hybrid models of dialogue management that can account for both the complexity and uncertainty that characterise many dialogue domains. The thesis also demonstrates how the parameters of probabilistic rules can be efficiently estimated using both supervised and reinforcement learning techniques. In the case of supervised learning, the rule parameters are learned by imitation on the basis of small amounts of Wizard-of-Oz data. Alternatively, rule parameters can also be optimised via trial and error from repeated interactions with a (real or simulated) user. Both learning strategies rely on Bayesian inference to iteratively estimate the parameter values and provide the best fit for the observed interaction data. Three consecutive experiments conducted in a human-robot interaction domain attest to the practical viability of the proposed framework and its advantages over traditional approaches. In particular, the empirical results of a user evaluation with 37 participants show that a dialogue manager structured with probabilistic rules outperforms both purely hand-crafted and purely statistical methods on an extensive range of subjective and objective metrics of dialogue quality. The modelling framework presented in this thesis is implemented in a new software toolkit called OpenDial, which is made freely available to the research community and can be used to develop various types of dialogue systems based on probabilistic rules. | |||
Lison, P. (2008) Robust Processing of Situated Spoken Dialogue. Master's thesis, Universität des Saarlandes. | [BibTex] | [Abstract] | |
@mastersthesis{LisonThesis2008, location = {Saarbr\"{u}cken}, Author = {Pierre Lison}, Month = {December}, School = {Universit\"{a}t des Saarlandes}, Title = {Robust Processing of Situated Spoken Dialogue}, Year = {2008}} | |||
Spoken dialogue is often considered as one of the most natural means of interaction between a human and a machine. It is, however, notoriously hard to process using NLP technology. As many corpus studies have shown, natural spoken dialogue is replete with disfluent, partial, elided or ungrammatical utterances, all of which are very hard to accommodate in a dialogue system. Furthermore, automatic speech recognition [ASR] is known to be a highly error-prone task, especially when dealing with complex, open-ended discourse domains. The combination of these two problems -- ill-formed and/or misrecognised speech inputs -- raises a major challenge to the development of robust dialogue systems. This thesis presents an integrated approach for addressing these issues in the context of domain-specific dialogues for human-robot interaction [HRI]. Several new techniques and algorithms have been developed to this end. They can be divided into two main lines of work. The first line of work pertains to speech recognition. We describe a new model for context-sensitive speech recognition, which is specifically suited to HRI. The underlying hypothesis is that, in situated human-robot interaction, ASR performance can be significantly improved by exploiting contextual knowledge about the physical environment (objects perceived in the visual scene) and the dialogue history (previously referred-to objects within the current dialogue). The language model is dynamically updated as the environment changes, and is used to establish expectations about uttered words which are most likely to be heard given the context. The second line of work deals with the robust parsing of spoken inputs. We present a new approach for this task, based on an incremental parser for Combinatory Categorial Grammar [CCG]. The parser takes word lattices as input and is able to handle ill-formed and misrecognised utterances by selectively relaxing and extending its set of grammatical rules. This operation is done via the introduction of non-standard CCG rules into the grammar. The choice of the most relevant interpretation is then realised via a discriminative model augmented with contextual information. The model includes a broad range of linguistic and contextual features, and can be trained with a simple perceptron algorithm. All the algorithms presented in this thesis are fully implemented, and integrated as part of a distributed cognitive architecture for autonomous robots. We performed an extensive evaluation of our approach using a set of Wizard of Oz experiments. The obtained results demonstrate very significant improvements in accuracy and robustness compared to the baseline. | |||
Lison, P. (2006) Implémentation d'une Interface Sémantique-Syntaxe basée sur des Grammaires d'Unification Polarisées. Master's thesis, Université Catholique de Louvain, Louvain-la-Neuve, Belgium. | [BibTex] | [Abstract] | |
@mastersthesis{LisonMscThesis06, Author = {Pierre Lison}, Keywords = {dependency grammar, constraint programming, meaning-text unification grammars, polarized unification grammars, semantics-syntax interface, extensible dependency grammar}, School = {Universit{\'e} Catholique de Louvain, Louvain-la-Neuve, Belgium}, Title = {Impl{\'e}mentation d'une Interface S{\'e}mantique-Syntaxe bas{\'e}e sur des Grammaires d'Unification Polaris{\'e}es}, Year = {2006}} | |||
This work relates to Natural Language Processing [NLP], a research field situated at the intersection of several classical disciplines such as computer science, linguistics, mathematics and psychology, whose object is the design of computational systems able to process (i.e. understand and/or generate) linguistic data, whether spoken or written. Achieving that goal often requires formal models able to simulate the behaviour of complex linguistic phenomena. Several theories have been elaborated to this end. Significant divergences exist between them, concerning their linguistic foundations as well as their grammatical formalisms and related software tools. Nevertheless, many efforts have recently been made to bring them closer together, and two major trends clearly seem to emerge from the main contemporary theories:
This study examines an essential component of all these models: the semantics-syntax interface, responsible for the mapping between the semantic and syntactic levels of the architecture. Indeed, many distortion phenomena can be found between these two levels in every human language. Examples include the handling of idioms and locutions, the active/passive alternation, the so-called "extraction" phenomena (relative subordinates, interrogative clauses), elliptic coordination, and many others. We approach this issue in the framework of a particular linguistic theory, Meaning-Text Unification Grammars [MTUG], an articulated mathematical model of language recently devised by S. Kahane, and its related description formalism, Polarized Unification Grammars [PUG]. The first part of our work deals with the general study of the role and inner workings of the semantics-syntax interface within this theory. We then propose a concrete implementation of it based on Constraint Programming, grounded in an axiomatization of our initial formalism as a Constraint Satisfaction Problem. Rather than developing the software entirely from scratch, we chose to reuse and adapt an existing tool, the XDG Development Kit, a grammar development environment for the meta-grammatical formalism of Extensible Dependency Grammar [XDG], itself entirely based on Constraint Programming. Practically, this work makes three original contributions to NLP research:
|
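To give a flavour of what casting such a formalism into a constraint satisfaction problem can look like, here is a toy Python illustration, loosely inspired by PUG's polarity saturation. It is not the thesis implementation (which adapts the XDG Development Kit), and all node names and categories below are invented.

```python
from itertools import permutations

# Toy illustration, loosely inspired by PUG's polarity saturation: every
# "white" (unsaturated) node must be unified with exactly one "black"
# (saturated) node carrying a compatible category, and no black node may
# absorb two white ones. All node names and categories are invented.

white = [("w1", "NP"), ("w2", "S")]
black = [("b1", "S"), ("b2", "NP"), ("b3", "NP")]

def saturations(white, black):
    """Enumerate injective, category-compatible matchings (brute force)."""
    for chosen in permutations(black, len(white)):
        if all(wc == bc for (_, wc), (_, bc) in zip(white, chosen)):
            yield [(w, b) for (w, _), (b, _) in zip(white, chosen)]

for solution in saturations(white, black):
    print(solution)   # e.g. [('w1', 'b2'), ('w2', 'b1')]
```

A real constraint-programming implementation would replace this brute-force enumeration with constraint propagation and search, which is precisely what motivates reusing a constraint-based environment such as the XDG Development Kit.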
Others (published volumes, technical reports)
Lison, P., Nilsson, M. & Recasens, M. (eds). (2012) Proceedings of the Student Research Workshop at the 13th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics. | [BibTex] | ||
@Book{SRWEACL2012:2012, editor = {Pierre Lison and Mattias Nilsson and Marta Recasens}, title = {Proceedings of the Student Research Workshop at the 13th Conference of the European Chapter of the Association for Computational Linguistics}, month = {April}, year = {2012}, address = {Avignon, France}, publisher = {Association for Computational Linguistics}} | |||
Wyatt, J., Kruijff, G.J., Lison, P., Zillich, M., Mörwald, T., Zhou, K., Brenner, M., Gretton, C., Jensfelt, P., Sjöö, K., Pronobis, A., Kristan, M., Mahnič, M. & Skočaj, D. (2010) Unifying representations of beliefs about beliefs and knowledge producing actions. Technical Report. (WP 1, year 2 deliverable, 134 pages). | [BibTex] | [Abstract] |
@techreport{wp1-dr2010, Author = {Jeremy Wyatt and Geert-Jan Kruijff and Pierre Lison and Michael Zillich and Thomas M\"{o}rwald and Kai Zhou and Michael Brenner and Charles Gretton and Patric Jensfelt and Kristoffer Sj\"{o}\"{o} and Andrzej Pronobis and Matej Kristan and Marko Mahni\v{c} and Danijel Sko\v{c}aj}, Institution = {CogX Project}, Note = {WP 1, year 2 deliverable, 134 pages}, Title = {Unifying representations of beliefs about beliefs and knowledge producing actions}, Year = {2010}}
Representing the epistemic state of the robot and how that epistemic state changes under action is one of the key tasks in CogX. In this report we describe progress on this in the first 18 months of the project, and set out a typology of epistemic knowledge. We describe the specific representations we have developed for different domains or modalities, or are planning to develop, and how those are related to one another.
| |||
Kruijff, G.J., Janiček, M., Kruijff-Korbayová, I., Lison, P., Meena, R. & Zender, H. (2009) Transparency in situated dialogue for interactive learning. Technical Report. (WP 6, year 1 deliverable, 38 pages). | [BibTex] | [Abstract] | |
@techreport{wp6-dr2009, Author = {Geert-Jan Kruijff and Miroslav Jani\v{c}ek and Ivana Kruijff-Korbayov\'{a} and Pierre Lison and Raveesh Meena and Hendrik Zender}, Institution = {CogX Project}, Month = {July}, Note = {WP 6, year 1 deliverable, 38 pages}, Title = {Transparency in situated dialogue for interactive learning}, Year = {2009}} | |||
A robot can use dialogue to try to learn more about the world. For this to work, the robot and a human need to establish a mutually agreed-upon understanding of what is being talked about, and why. It is thereby particularly important for the human to understand what the robot is after. The notion of 'transparency' tries to capture this. It involves the relation between why a question is asked, how it relates to private and shared beliefs, and how it reveals what the robot does or does not know. For year 1, WP6 investigated means for establishing transparency in situated dialogue for interactive learning. This covered two aspects: how to phrase questions for knowledge gathering and refinement, and how to verbalize knowledge. Results include methods for verbalizing what the robot does and does not know about referents and aspects of the environment, based on a mixture of prior and autonomously acquired knowledge and basic methods for self-understanding (Task 6.1); and novel algorithms for determining the content and context of question subdialogues that gather more information to help resolve misunderstandings or fill gaps (Task 6.2). WP6 also reports results on making situated spoken dialogue more robust, employing probabilistic models that use multi-modal information to reduce uncertainty in comprehension.
| |||
Lison, P. (2004) Aux sources de l'inspiration. Revue Louvain(145):14-15. (Dossier thématique 'Comment apprendre la paix'). | [BibTex] | ||
@article{louvain04, Author = {Pierre Lison}, Journal = {Revue Louvain}, Month = {March}, Note = {Dossier th\'{e}matique 'Comment apprendre la paix'}, Number = {145}, Pages = {14-15}, Title = {Aux sources de l'inspiration}, Year = {2004}} | |||