texttodict.1
# LQi 25.11.2020
# help file for texttodict command
texttodict - a shell script to create aspell dictionaries and word-list dictionaries
General usage:
>> texttodict COMMAND [OPTIONS] inputfile
To get the version number and quit:
>> texttodict -version
To show this help and quit:
>> texttodict -help
To suppress all messages except the generated output:
>> texttodict -nomsg COMMAND [OPTIONS] inputfile
To build a dictionary for aspell from a text file with one word per line:
>> texttodict -dict build [-l lang] [-o filename] inputfile.txt
To create a text file with one word per line from an aspell dictionary:
>> texttodict -dict dump [-l lang] -o filename
The default language is US English (en_US) and the default output filename
is 'inputfile.dict'. Please consult 'man aspell' for proper use of aspell.
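The build command expects an input file with one word per line. A minimal sketch of preparing such a file from free text with standard tools is shown here; the helper name words_per_line and the file names words.txt and words.dict are hypothetical and are not part of texttodict.

```shell
#!/bin/sh
# Hypothetical helper (not part of texttodict): split free text read from
# stdin into one word per line and drop duplicates. Case is preserved.
words_per_line() {
    tr -cs '[:alpha:]' '\n' |   # turn every run of non-letters into a newline
        sed '/^$/d' |           # drop the empty line a leading non-letter leaves
        LC_ALL=C sort -u        # byte-order sort, duplicates removed
}

# Example run on a short sentence.
printf 'The cat saw the other cat.\n' | words_per_line

# The result could then be saved and passed to texttodict (left commented
# out because it needs texttodict and aspell to be installed):
#   words_per_line < book.txt > words.txt
#   texttodict -dict build -l en_US -o words.dict words.txt
```

The helper only handles ASCII letters; text in other alphabets would need a different splitting rule.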
To generate a bag of words (bow) from the input file and write it to the given
output file (or to the default output file name, inputfile.tdict):
>> texttodict -bow [OPTIONS] inputfile
where OPTIONS are:
-ef {filename}                exclude the stop words listed in the given file
-es {string}                  exclude the space-separated words in the given string
-ep {pat}                     exclude words matching the pattern {pat}
-w {number}                   exclude words shorter than {number} characters
-{number}                     truncate the output to {number} lines
-gt|-eq|-lt|-ge|-le {number}  keep only words whose number of occurrences
                              satisfies the given comparison with {number}
-o filename                   output filename
-so                           sort output by number of occurrences
-sa                           sort output alphanumerically
-sar                          sort output alphanumerically in reverse order
-sl                           sort output by word length
-slr                          sort output by word length in reverse order
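For orientation, the idea behind the bag of words can be approximated with standard tools. The sketch below is not the implementation used by texttodict: the function name bow, the lowercasing step, and the "count word" output layout are all assumptions made for illustration. Its two arguments mimic -w {number} and -{number}, and the result is sorted by number of occurrences, roughly as -so would do.

```shell
#!/bin/sh
# Hypothetical stand-in for "texttodict -bow" (not the real implementation):
# count word occurrences in the text read from stdin.
#   $1  minimum word length, similar to -w {number}   (default 1)
#   $2  maximum number of output lines, similar to -{number} (default 10)
bow() {
    min_len=${1:-1}
    max_lines=${2:-10}
    tr -cs '[:alpha:]' '\n' |                    # one word per line
        tr '[:upper:]' '[:lower:]' |             # assumption: fold case
        awk -v n="$min_len" 'length($0) >= n' |  # drop short words and empty lines
        LC_ALL=C sort | uniq -c |                # count identical words
        LC_ALL=C sort -k1,1nr -k2,2 |            # most frequent first, then by word
        awk '{ print $1, $2 }' |                 # normalise to "count word"
        head -n "$max_lines"
}

# Words of at least 3 letters, top 3 lines only.
printf 'A cat and a dog; the cat and the bird.\n' | bow 3 3
```

Filtering by occurrence count (the -gt, -ge and related options) could be added to such a pipeline with one more awk condition on the first column.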
EOF