Applications often need to read or write UTF-8 encoded files, whether for localization or because user input contains non-ASCII characters.
Some data sources also provide data only in UTF-8. In this Java tutorial, we will look at two simple examples: writing UTF-8 content to a file and reading it back.
1. Writing UTF-8 Encoded Data into a File
The following Java example writes UTF-8 encoded data to a file. It passes the UTF-8 character encoding to the OutputStreamWriter constructor.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class WriteUTF8Data {
  public static void main(String[] args) {
    File fileDir = new File("c:\\temp\\test.txt");

    // try-with-resources flushes and closes the writer even if a write fails
    try (Writer out = new BufferedWriter(
        new OutputStreamWriter(new FileOutputStream(fileDir), StandardCharsets.UTF_8))) {
      out.append("Howtodoinjava.com").append("\r\n");
      out.append("UTF-8 Demo").append("\r\n");
      out.append("क्षेत्रफल = लंबाई * चौड़ाई").append("\r\n");
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
See also: How to compile and run a Java program written in another language
2. Reading UTF-8 Encoded File
To read data from a UTF-8 encoded file, pass StandardCharsets.UTF_8 to the InputStreamReader constructor.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class ReadUTF8Data {
  public static void main(String[] args) {
    File fileDir = new File("c:\\temp\\test.txt");

    // Decode the file's bytes as UTF-8; try-with-resources closes the reader
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(new FileInputStream(fileDir), StandardCharsets.UTF_8))) {
      String str;
      while ((str = in.readLine()) != null) {
        System.out.println(str);
      }
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
Program Output:
Howtodoinjava.com
UTF-8 Demo
क्षेत्रफल = लंबाई * चौड़ाई
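The same write-then-read round trip can also be sketched with the java.nio.file API (Java 7 and later), which accepts the charset directly and avoids the nested stream constructors. This is an alternative to the two examples above, not a replacement for them; the class name Utf8RoundTrip, the helper names, and the temporary file are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Utf8RoundTrip {

  // Writes each line to the file, encoding the characters as UTF-8.
  public static void writeLines(Path file, List<String> lines) throws IOException {
    try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
      for (String line : lines) {
        out.write(line);
        out.newLine();
      }
    }
  }

  // Reads the file back, decoding its bytes as UTF-8.
  public static List<String> readLines(Path file) throws IOException {
    List<String> lines = new ArrayList<>();
    try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      String line;
      while ((line = in.readLine()) != null) {
        lines.add(line);
      }
    }
    return lines;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical temporary file, used here instead of c:\temp\test.txt
    Path file = Files.createTempFile("utf8-demo", ".txt");
    writeLines(file, Arrays.asList("Howtodoinjava.com", "UTF-8 Demo", "क्षेत्रफल = लंबाई * चौड़ाई"));
    for (String line : readLines(file)) {
      System.out.println(line);
    }
    Files.delete(file);
  }
}
```

Naming the charset explicitly on both sides keeps the result independent of the platform default encoding.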
Happy Learning !!
Rakhi
Instead of the output क्षेत्रफल = लंबाई * चौड़ाई, it prints ????=???? * ???. How can I fix this?
Lokesh Gupta
Follow the instructions given here.
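One common cause of the ???? output, assuming the file itself was written correctly, is that System.out encodes text with a platform default charset that has no mapping for Devanagari. The sketch below reproduces that effect with US-ASCII and shows one possible workaround: wrapping System.out in a PrintStream that encodes as UTF-8. The class name Utf8ConsoleDemo and the helper printWith are hypothetical, and the workaround only helps if the terminal itself can display UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8ConsoleDemo {

  // Returns the bytes a PrintStream using the given charset would emit for the text.
  public static byte[] printWith(String text, Charset charset) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (PrintStream ps = new PrintStream(sink, true, charset.name())) {
      ps.print(text);
    } catch (UnsupportedEncodingException e) {
      // Cannot happen for a Charset instance that already exists
      throw new IllegalStateException(e);
    }
    return sink.toByteArray();
  }

  public static void main(String[] args) throws UnsupportedEncodingException {
    String text = "क्षेत्रफल = लंबाई * चौड़ाई";

    // US-ASCII cannot represent Devanagari, so those characters are replaced by '?'
    byte[] ascii = printWith(text, StandardCharsets.US_ASCII);
    System.out.println(new String(ascii, StandardCharsets.US_ASCII));

    // Possible workaround: print through a stream that encodes as UTF-8
    PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
    utf8Out.println(text);
  }
}
```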
Shivam
Hi Lokesh,
My question is somewhat different from this post.
How can I escape accented characters (the input may be UTF-8, UTF-16, ...) without using the Apache Commons Lang library?
Thanks in advance.
Regards,
Lokesh Gupta
Normalizer.normalize(string, Normalizer.Form.NFD);
Read more: https://stackoverflow.com/questions/3322152/is-there-a-way-to-get-rid-of-accents-and-convert-a-whole-string-to-regular-lette
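As a minimal sketch of how the Normalizer call above can be used to drop accents: NFD splits each accented letter into a base letter followed by combining marks, and a regular expression can then remove the marks. The class name AccentStripper and the method stripAccents are hypothetical. The approach is intended for Latin-script text; letters with no decomposition (such as ø or ß) are left unchanged, and applying it to scripts like Devanagari would strip vowel signs and change the words.

```java
import java.text.Normalizer;

public class AccentStripper {

  // Decomposes accented letters (NFD), then removes the combining marks.
  public static String stripAccents(String input) {
    String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
    return decomposed.replaceAll("\\p{M}+", "");
  }

  public static void main(String[] args) {
    System.out.println(stripAccents("Crème brûlée")); // prints "Creme brulee"
  }
}
```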