Are you trying to get the character encoding of a file in Linux? Follow this guide to learn some simple ways to find the character encoding of a file in Linux.
Getting character encoding of a file in Linux
In Linux, there are a number of commands that you can use to get the character encoding of a file.
Such commands include:
- file
- encguess
- NPM dfeal
Get character encoding of a file using file command in Linux
file is a command in Linux that is used to determine file types. It can also be used to get the character encoding of files.
Assuming you have a file, file.txt, and you want to get its character encoding, run the command below;
file file.txt
Sample output;
file.txt: UTF-8 Unicode text
From the output, the character encoding of file.txt is UTF-8.
You can also pass the -i/--mime option to print MIME type strings, such as text/plain; charset=us-ascii, rather than the more human-readable description, such as ASCII text;
file -i file.txt
Sample output;
file.txt: text/plain; charset=utf-8
If you want to omit filenames from the command output, use the -b/--brief option;
file -ib file.txt
Sample output;
text/plain; charset=utf-8
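If only the charset is needed, for example inside a script, the --mime-encoding option of file can be combined with -b. The sketch below is a minimal illustration and not part of the original steps; get_charset is a hypothetical helper name, and the demo runs against a throwaway file rather than file.txt.

```shell
#!/bin/sh
# Print only the charset that `file` detects for the given path.
# get_charset is a hypothetical helper name, not part of `file` itself.
get_charset() {
    # -b drops the filename; --mime-encoding prints just the charset
    file -b --mime-encoding -- "$1"
}

# Demo on a temporary file so the sketch does not depend on file.txt
sample=$(mktemp)
printf 'caf\303\251\n' > "$sample"   # the word "café" as UTF-8 bytes
get_charset "$sample"
rm -f "$sample"
```

For the file from this guide, the call would be get_charset file.txt.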
Get character encoding of a file using encguess command in Linux
encguess is a command provided by the perl (Debian/Ubuntu) or perl/perl-Encode (RHEL based) package that can be used to guess the character encodings of files.
The command line syntax;
encguess [options] filename
Using the example file above, file.txt;
encguess file.txt
Sample output;
file.txt UTF-8
Read more on the man page, man encguess.
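The man page also describes a -s option that limits the guess to a list of suspect encodings. The sketch below shows that option together with a small filter that keeps only the encoding name from the output. It is an illustration under assumptions: last_field is a hypothetical helper, the suspect list is the one used in the man page example, and the filter assumes the encoding is the last whitespace-separated field, as in the sample output above.

```shell
#!/bin/sh
# Keep only the last whitespace-separated field of each input line,
# which is the encoding name in encguess-style "filename encoding" output.
# last_field is a hypothetical helper name, not part of encguess.
last_field() {
    awk '{ print $NF }'
}

# -s limits the guess to a comma-separated list of suspect encodings.
# The guard skips the call when encguess or file.txt is missing.
if command -v encguess >/dev/null 2>&1 && [ -f file.txt ]; then
    encguess -s euc-jp,shiftjis,7bit-jis file.txt | last_field
fi
```

Filenames that contain spaces are not a problem for this filter, since only the final field is printed.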
Get character encoding of a file using dfeal command in Linux
dfeal (detect-file-encoding-and-language) is an NPM command that is used to determine the encoding and language of text files.
To install detect-file-encoding-and-language, you first need to install NPM;
Ubuntu/Debian;
sudo apt install nodejs npm -y
For RHEL based distros, see how to install NPM.
Next, install the dfeal command;
sudo npm install -g detect-file-encoding-and-language
Getting the character encoding;
dfeal file.txt
{
  "encoding": "UTF-8",
  "language": "spanish",
  "confidence": {
    "encoding": 1,
    "language": 0.02
  }
}
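Since dfeal prints JSON, a script usually needs just the encoding value. The sketch below pulls it out with sed. It is a rough illustration, not a full JSON parser: dfeal_encoding is a hypothetical helper name, and it assumes the output has the shape of the sample above, with the string-valued "encoding" key on its own line. The demo feeds it the sample output, so dfeal does not need to be installed to try it.

```shell
#!/bin/sh
# Extract the detected encoding from dfeal-style JSON read on stdin.
# dfeal_encoding is a hypothetical helper name. It matches only a quoted
# string value, so the numeric "encoding" confidence entry is skipped.
dfeal_encoding() {
    sed -n 's/.*"encoding": *"\([^"]*\)".*/\1/p'
}

# Demo on the sample output from this guide
printf '%s\n' '{' '"encoding": "UTF-8",' '"language": "spanish",' \
    '"confidence": {' '"encoding": 1,' '"language": 0.02' '}' '}' \
    | dfeal_encoding
```

With dfeal installed, the equivalent call would be dfeal file.txt | dfeal_encoding. A JSON-aware tool would be more robust if the output format differs from the sample.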
There could be more commands to get the character encoding for a file in Linux. Leave them in the comment section.
That marks the end of our guide on how to get the character encoding of a file in Linux.