Datamation content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More.
In the beginning were C and C++, and hosts of other computer programming languages. These are all based on ASCII (American Standard Code for Information Interchange), which as the name implies is based on the English alphabet. Which wouldn’t be an issue except there are lot of other humans in the world, and they don’t use the English alphabet.
So along came Unicode to the rescue. Unicode provides a framework for all alphabets of the world to be represented on computers. UTF-8 is the most popular Unicode implementation because it preserves backwards compatibility with ASCII. Which is all fun to know, but what good is it when you’re looking at piles of computer files that need to converted from ISO-8859-1 (Latin-1, Western European) into whatever encoding you prefer? Naturally, there are a number of utilities just for this task.
GNU Recode supports over 150 character sets, and converts just about anything to anything. For example, there are still users of legacy Linux systems that still run ISO-8859-1. Recode will convert these to nice modern UTF-8, like this:
$ recode UTF-8 recode-test.txt
Check out the GNU Recode Manual for instructions.
That’s fast and easy enough, but there’s one more job- converting the filename. The convmv command is just the tool for this job. This example converts all the ISO-8859-1 filenames in the files/ directory to UTF-8:
$ convmv -f iso-8859-1 -t utf8 --notest files/
convmv run without the –notest option does a dry-run without changing anything, which is probably a wise thing to do first.
Maybe you have a file that you don’t know what the encoding is. Upload your file to this online tool and it will tell you. You can even do file conversions here.
Resources
The subject of character encoding is huge and bewildering, especially for us dinosaurs from the typewriter era. By golly, when you hit a typewriter key it came out the same way every single time. Wikipedia has a number of excellent introductory articles:
Unicode
UTF-8
ISO/IEC 8859
This article was first published on LinuxPlanet.com.
RELATED NEWS AND ANALYSIS
-
Ethics and Artificial Intelligence: Driving Greater Equality
FEATURE | By James Maguire,
December 16, 2020
-
AI vs. Machine Learning vs. Deep Learning
FEATURE | By Cynthia Harvey,
December 11, 2020
-
Huawei’s AI Update: Things Are Moving Faster Than We Think
FEATURE | By Rob Enderle,
December 04, 2020
-
Keeping Machine Learning Algorithms Honest in the ‘Ethics-First’ Era
ARTIFICIAL INTELLIGENCE | By Guest Author,
November 18, 2020
-
Key Trends in Chatbots and RPA
FEATURE | By Guest Author,
November 10, 2020
-
Top 10 AIOps Companies
FEATURE | By Samuel Greengard,
November 05, 2020
-
What is Text Analysis?
ARTIFICIAL INTELLIGENCE | By Guest Author,
November 02, 2020
-
How Intel’s Work With Autonomous Cars Could Redefine General Purpose AI
ARTIFICIAL INTELLIGENCE | By Rob Enderle,
October 29, 2020
-
Dell Technologies World: Weaving Together Human And Machine Interaction For AI And Robotics
ARTIFICIAL INTELLIGENCE | By Rob Enderle,
October 23, 2020
-
The Super Moderator, or How IBM Project Debater Could Save Social Media
FEATURE | By Rob Enderle,
October 16, 2020
-
Top 10 Chatbot Platforms
FEATURE | By Cynthia Harvey,
October 07, 2020
-
Finding a Career Path in AI
ARTIFICIAL INTELLIGENCE | By Guest Author,
October 05, 2020
-
CIOs Discuss the Promise of AI and Data Science
FEATURE | By Guest Author,
September 25, 2020
-
Microsoft Is Building An AI Product That Could Predict The Future
FEATURE | By Rob Enderle,
September 25, 2020
-
Top 10 Machine Learning Companies 2021
FEATURE | By Cynthia Harvey,
September 22, 2020
-
NVIDIA and ARM: Massively Changing The AI Landscape
ARTIFICIAL INTELLIGENCE | By Rob Enderle,
September 18, 2020
-
Continuous Intelligence: Expert Discussion [Video and Podcast]
ARTIFICIAL INTELLIGENCE | By James Maguire,
September 14, 2020
-
Artificial Intelligence: Governance and Ethics [Video]
ARTIFICIAL INTELLIGENCE | By James Maguire,
September 13, 2020
-
IBM Watson At The US Open: Showcasing The Power Of A Mature Enterprise-Class AI
FEATURE | By Rob Enderle,
September 11, 2020
-
Artificial Intelligence: Perception vs. Reality
FEATURE | By James Maguire,
September 09, 2020