DK-CLARIN LSP Corpus - IT domain
Please use the following text to cite this item:
Centre for Language Technology, NorS, University of Copenhagen and The Danish Language Council, 2011,
DK-CLARIN LSP Corpus - IT domain, CLARIN-DK-UCPH Centre Repository,
http://hdl.handle.net/20.500.12115/15.
Authors
Item identifier
Date issued
2011
Size
1,101,059 words, 66 files
Language(s)
Description
The texts in the IT domain come from Libris, Open Office and Aktuel Naturvidenskab, and were collected in the DK-CLARIN project, WP2.2, 2008-2011.
The corpus consists of 1,101,059 words in 66 files.
Communicative setting/number of files: expert->advanced (5), expert->basic (61).
All texts are in XML TEI P5 format (TEIP5DKCLARIN format), with tokenisation, sentence and paragraph segmentation, POS tagging, lemmatisation and termhood annotation placed in separate text-external span groups.
"DK-CLARIN LSP Corpus - IT domain" is part of the Danish DK-CLARIN LSP corpus, which consists of seven sub-corpora from the following subject domains: Agriculture, Construction, Economics, Environment, Health, IT and Nanotechnology.
Acknowledgement
n/a
Project code: n/a
Project name: DK-CLARIN
Subject(s)
Collections
Files in this item
- Name
- OO_it.zip
- Size
- 3.07 MB
- Format
- application/zip
- Description
- Corpus
- MD5
- 18153c841b996557afaacf1f3f34f956

The file preview has not been generated yet. Please try again later or contact the system administrator info@clarin.dk
- Name
- textCorpusProfile.xsd
- Size
- 142.26 KB
- Format
- text/xml
- Description
- Schema
- MD5
- 7d6b452b88175041133ea8020e453cd8

- Name
- README_IT.txt
- Size
- 2.96 KB
- Format
- text/plain
- Description
- readme
- MD5
- 60d7afae3450185ccad1f0712b360043

- Name
- DKCLARIN_fagsprogligt_korpus_dokumentation_2011.pdf
- Size
- 361.81 KB
- Format
- application/pdf
- Description
- Documentation
- MD5
- e1752deaa6888e2f856811c8d933e655

- Name
- text-format.pdf
- Size
- 111.77 KB
- Format
- application/pdf
- Description
- Documentation
- MD5
- c4c4b5f1cd83ff232c44bc7692621da7

- Name
- AktuelNaturvidenskab.zip
- Size
- 1.83 MB
- Format
- application/zip
- Description
- Corpus
- MD5
- 320cc51634aeae8fe50a93e2b8819628

- Name
- text-header.pdf
- Size
- 375.79 KB
- Format
- application/pdf
- Description
- Documentation
- MD5
- 47825d0010a398bf10ce1564da2a15f0

- Name
- dkclarin-LSPIT-cmdi_textCorpus.xml
- Size
- 17.23 KB
- Format
- text/xml
- Description
- CMDI metadata
- MD5
- b5f5819d30472737ec6f4255329c24d6

- Name
- libris_it.zip
- Size
- 39.68 MB
- Format
- application/zip
- Description
- Corpus
- MD5
- 8a44346cfa1255c26f3f62014fcb2fa1

- Name
- teiHeader.xsd
- Size
- 59.88 KB
- Format
- text/xml
- Description
- Schema
- MD5
- 9fc5374ad34319278f437b963454f972
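
The MD5 values listed for each file can be used to check that a download arrived intact. The following is a minimal sketch in Python, assuming the files have been saved under their listed names in a single local directory. The helper names `md5_of` and `verify_downloads` are illustrative and are not part of the corpus distribution.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "OO_it.zip": "18153c841b996557afaacf1f3f34f956",
    "textCorpusProfile.xsd": "7d6b452b88175041133ea8020e453cd8",
    "README_IT.txt": "60d7afae3450185ccad1f0712b360043",
    "DKCLARIN_fagsprogligt_korpus_dokumentation_2011.pdf": "e1752deaa6888e2f856811c8d933e655",
    "text-format.pdf": "c4c4b5f1cd83ff232c44bc7692621da7",
    "AktuelNaturvidenskab.zip": "320cc51634aeae8fe50a93e2b8819628",
    "text-header.pdf": "47825d0010a398bf10ce1564da2a15f0",
    "dkclarin-LSPIT-cmdi_textCorpus.xml": "b5f5819d30472737ec6f4255329c24d6",
    "libris_it.zip": "8a44346cfa1255c26f3f62014fcb2fa1",
    "teiHeader.xsd": "9fc5374ad34319278f437b963454f972",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch' or 'missing'."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == checksum:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumption: the downloaded files are in the current working directory.
    for name, status in verify_downloads(".").items():
        print(f"{status:8}  {name}")
```

A status of `mismatch` suggests a corrupted or incomplete download, and `missing` means the file was not found under the expected name. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.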


