Hits 1 – 20 of 141

1	Preparing Legal Documents for NLP Analysis: Improving the Classification of Text Elements by Using Page Features
	Josi, Frieda; Wartena, Christian (Prof. Dr.); Heid, Ulrich. - : AIRCC Publishing Corporation, 2022. : Hannover : Hochschule Hannover, 2022
	Abstract: Legal documents often have a complex layout with many different headings, headers and footers, side notes, etc. For further processing, it is important to extract these individual components correctly from a legally binding document, for example a signed PDF. A common approach is to classify each (text) region of a page using its geometric and textual features. This approach works well when the training and test data have a similar structure and when the documents of the collection to be analyzed have a rather uniform layout. We show that the use of global page properties can improve the accuracy of text element classification: we first classify each page into one of three layout types. We then train a separate classifier for each of the three page types and thereby improve the accuracy on a manually annotated collection of 70 legal documents consisting of 20,938 text elements. When we split by page type, we achieve an improvement from 0.95 to 0.98 for single-column pages with left marginalia and from 0.95 to 0.96 for double-column pages. We developed our own feature-based method for page layout detection, which we benchmark against a standard implementation of a CNN image classifier. The approach presented here is based on a corpus of freely available German contracts and general terms and conditions. Both the corpus and all manual annotations are made freely available. The method is language agnostic.
	Keyword: Automatic classification; Image recognition; ddc:020; Document analysis; Machine learning; Law; Expository text; Text mining
	URL: https://serwiss.bib.hs-hannover.de/files/2161/csit120102.pdf https://serwiss.bib.hs-hannover.de/frontdoor/index/index/docId/2161 http://nbn-resolving.org/urn:nbn:de:bsz:960-opus4-21618 https://doi.org/10.25968/opus-2161 https://nbn-resolving.org/urn:nbn:de:bsz:960-opus4-21618
	BASE

2	Representing Standard Text Formulations as Directed Graphs
	Josi, Frieda; Wartena, Christian (Prof. Dr.); Heid, Ulrich (Prof. Dr.). - : Cham : Springer, 2021. : Hannover : Hochschule Hannover, 2021
	BASE

3	The cooccurrence of linguistic structures
	Proisl, Thomas [author]; Evert, Stefan [academic supervisor]; Evert, Stefan [reviewer]. - Erlangen : FAU University Press, 2019
	DNB Subject Category Language

4	Detecting Paraphrases of Standard Clause Titles in Insurance Contracts
	Heid, Ulrich; Wartena, Christian (Prof. Dr.); Josi, Frieda. - : Hannover : Hochschule Hannover, 2019
	BASE

5	Representing human and machine dictionaries in markup languages (SGML, XML)
	Witt, Andreas [author]; Romary, Laurent [author]; Schweickard, Wolfgang [editor]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2018
	DNB Subject Category Language

6	A taxonomy of user guidance devices for e-lexicography
	Bothma, Theo J. D.; Prinsloo, Danie; Heid, Ulrich
	In: Lexicographica. Internationales Jahrbuch für Lexikographie. International annual for lexicography. Revue internationale de lexicographie 33 (2018), 391-422
	IDS OBELEX meta

7	Semi-automating the Reading Programme for a Historical Dictionary Project
	van Niekerk, Tim; Schäfer, Johannes; Heid, Ulrich
	In: Lexikos; Vol. 28 (2018) ; 2224-0039 (2018)
	BASE

8	Direct User Guidance in e-Dictionaries for Text Production and Text Reception - The Verbal Relative in Sepedi as a Case Study
	Prinsloo, Danie; Bothma, Theo J. D.; Heid, Ulrich...
	In: Lexikos. Journal of the African Association for Lexicography 27 (2017), 403-426
	IDS OBELEX meta

9	Direct User Guidance in e-Dictionaries for Text Production and Text Reception — The Verbal Relative in Sepedi as a Case Study
	Prinsloo, D.J.; Bothma, Theo J.D.; Heid, Ulrich...
	In: Lexikos; Vol. 27 (2017) ; 2224-0039 (2017)
	BASE

10	Enabling Selective Queries and Adapting Data Display in the Electronic Version of a Historical Dictionary
	van Niekerk, Tim; Stadler, Heike; Heid, Ulrich
	In: Proceedings of the 17th EURALEX International Congress: Lexicography and Linguistic Diversity. Tbilisi, Georgia 6 - 10 September 2016 (2016), 635-646
	IDS OBELEX meta

11	French Specialised Medical Constructions: Lexicographic Treatment and Corpus Coverage in General and Specialised Dictionaries
	Wandji Tchami, Ornella; Heid, Ulrich; Grabar, Natalia
	In: Proceedings of the 17th EURALEX International Congress: Lexicography and Linguistic Diversity. Tbilisi, Georgia 6 - 10 September 2016 (2016), 521-528
	IDS OBELEX meta

12	Semantic and syntactic properties of verbs of communication
	Proost, Kristel [author]; Glatz, Daniel [author]; Heid, Ulrich [editor]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2015
	DNB Subject Category Language

13	Recent Initiatives towards New Standards for Language Resources
	Herzog, Gottfried [author]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2015
	DNB Subject Category Language

14	Multilingual language resources and interoperability
	Witt, Andreas [author]; Heid, Ulrich [author]; Sasaki, Felix [author]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2015
	DNB Subject Category Language

15	Recent Initiatives towards New Standards for Language Resources
	Herzog, Gottfried; Heid, Ulrich; Trippel, Thorsten...
	In: GSCL 2015: Proceedings of the Int. Conference of the German Society for Computational Linguistics and Language Technology, University of Duisburg-Essen, Germany, Sep 30-Oct 2, 2015 (2015), 154-156
	IDS Bibliografie zur Gesprächsforschung

16	Corpora
	Heid, Ulrich
	In: Word-Formation. An International Handbook of the Languages of Europe. Volume 3 (2015), 2354-2371
	IDS Bibliografie zur deutschen Grammatik

17	Recent Initiatives towards New Standards for Language Resources
	Herzog, Gottfried; Heid, Ulrich; Trippel, Thorsten...
	In: International Conference of the German Society for Computational Linguistics and Language Technology ; https://hal.inria.fr/hal-01464476 ; International Conference of the German Society for Computational Linguistics and Language Technology, Sep 2015, Essen, Germany (2015)
	BASE

18	Distinguishing specialised discourse: the example of juridical texts on industrial property rights and trademark legislation
	Cap, Fabienne [author]; Heid, Ulrich [author]. - Stuttgart : Universitätsbibliothek der Universität Stuttgart, 2014
	DNB Subject Category Language

19	Resource interoperability revisited
	Eckart, Kerstin [author]; Heid, Ulrich [author]. - Hildesheim : Universitätsbibliothek Hildesheim, 2014
	DNB Subject Category Language

20	Natural Language Processing Techniques for Improved User-friendliness of Electronic Dictionaries
	Heid, Ulrich
	In: Proceedings of the 16th EURALEX International Congress: The User in Focus, Bolzano/Bozen, Italy 15 - 19 July 2014 (2014), 47-61
	IDS OBELEX meta
