How to Use the Java String to Byte Array Conversion
For a Java program to process text, every character must be represented as a number. The Unicode standard assigns a number (a code point) to every character, and a character encoding such as UTF-8 maps those numbers to bytes. Java provides the “getBytes” method for converting string data to a byte array. This method belongs to the “java.lang.String” class.
The String class, found in the “java.lang” package, represents a sequence of characters. Why do we need to manipulate strings? For any Java program that interacts with humans, string manipulation is key, and programs are ultimately meant to simplify life for humans. Understanding string manipulation gives you the ability to manage textual data effectively.
When dealing with data in databases, String to byte array conversion is used very frequently, for example when text has to be stored or transferred in binary form. This conversion is also often required when working with the Java Cryptography Extension (JCE), because its APIs operate on bytes rather than on strings.
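As a rough illustration of why cryptographic code needs bytes, the sketch below hashes a string with SHA-256 through the standard `java.security.MessageDigest` class. The class name `DigestSketch` and the helper `sha256Hex` are hypothetical, and hashing is used here only as a simple stand-in for the wider family of cryptographic operations, which likewise take byte arrays as input.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class for illustration; not part of the original article.
class DigestSketch {
    // Encodes the text as UTF-8 bytes, hashes them, and returns lowercase hex.
    static String sha256Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("Udemy online courses"));
    }
}
```

The digest API never sees the string itself, only the bytes, so the charset chosen in `getBytes` directly affects the resulting hash.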
What is the String getBytes Method?
Simply put, the String “getBytes” method converts or encodes a string into an array of bytes. Like every method, there is a specific way to call this method too. Since it encodes the string, a charset can optionally be provided as a parameter. A “charset” is a named mapping between characters and the byte sequences that represent them. When no charset is provided, the method uses the default charset. This default charset is the one given by the “file.encoding” system property (on Java 18 and later it is UTF-8 unless overridden). The JVM reads this property at startup, so a change to it takes effect only after the Java Virtual Machine (JVM) is restarted.
It is important that the charset used to encode a string matches the charset later used to decode the bytes. You can also provide the charset by name (charsetName) as a String. Note that in this case, if the charsetName is not recognized, an “UnsupportedEncodingException” is thrown.
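To see which charset the no-argument form would pick on a given machine, a small sketch like the following can help. The class name `DefaultCharsetSketch` and the helper `encodeWithDefault` are hypothetical; `Charset.defaultCharset()` and the `file.encoding` property are standard. The printed values depend on the JVM and operating system, so no output is shown.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper class for illustration.
class DefaultCharsetSketch {
    // Encodes with whatever charset the JVM treats as its default.
    static byte[] encodeWithDefault(String text) {
        return text.getBytes();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
        System.out.println(Arrays.toString(encodeWithDefault("Udemy")));
    }
}
```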
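The exception behavior described above can be sketched as follows. The class `CharsetNameSketch`, the helper `encodeOrNull`, and the made-up charset name `NO-SUCH-CHARSET` are all hypothetical; returning `null` is just one possible way to signal an unknown name.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper class for illustration.
class CharsetNameSketch {
    // Returns the encoded bytes, or null when the JVM does not recognize the name.
    static byte[] encodeOrNull(String text, String charsetName) {
        try {
            return text.getBytes(charsetName);
        } catch (UnsupportedEncodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // "UTF-8" is supported on every Java platform.
        System.out.println(encodeOrNull("Udemy", "UTF-8") != null);
        // An invented name takes the exception path.
        System.out.println(encodeOrNull("Udemy", "NO-SUCH-CHARSET") != null);
    }
}
```

Because the charset name is only checked at run time, a typo in the name compiles cleanly and fails later. Passing a `Charset` object instead avoids the checked exception.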
What are the Forms of the getBytes Method?
The “getBytes” method usage differs based on the parameter passed through it. Here are the three forms that the method is commonly used in:
public byte[] getBytes()
Since no charset is specified explicitly, the default JVM charset is used and a byte array is returned. This simply returns an array of bytes that encodes the string on which the method is called.
Code Snippet: public byte[] getBytes()
The following code snippet shows how a simple getBytes call works in Java:
import java.util.Arrays;

public class Main {
    public static void main(String[] argv) {
        // One leading space and four trailing spaces, matching the output below
        String str = " Udemy online courses    ";
        // Encode the string using the JVM's default charset
        byte[] bytes = str.getBytes();
        System.out.println(Arrays.toString(bytes));
    }
}
Output:
[32, 85, 100, 101, 109, 121, 32, 111, 110, 108, 105, 110, 101, 32, 99, 111, 117, 114, 115, 101, 115, 32, 32, 32, 32]
In the above code snippet, we first create a String object. We then call getBytes on it and assign the resulting byte array to a variable. Arrays.toString(bytes) returns a textual list of the numeric byte values that encode the characters of the string. This code is suitable when you do not need a particular encoding: the system picks the default encoding as specified in the file encoding system property.
Using the Writer Class for String to Byte Array Conversion
Another way of converting a string to a byte array in Java when the encoding is not known is by using the following snippet:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

public static byte[] convertStringToBytes(String input) throws IOException {
    ByteArrayOutputStream stream = new ByteArrayOutputStream();
    // No charset is given, so the writer encodes with the default charset
    try (Writer writer = new OutputStreamWriter(stream)) {
        writer.write(input);
        writer.flush();
    }
    return stream.toByteArray();
}
The above snippet uses the Writer class, which is a character-stream class in Java. It belongs to the java.io package. The Writer class is the abstract base for all classes that write character output streams. Its subclass OutputStreamWriter encodes the characters written to it into bytes, here using the default charset, and ByteArrayOutputStream collects those bytes so they can be returned as an array.
public byte[] getBytes(Charset charset)
In this form, a specific charset is specified and the string is encoded based on that charset. So, if the type of encoding you want is UTF-8, this statement will return an array of bytes that is UTF-8 encoded.
Code Snippet: public byte[] getBytes(Charset charset)
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        String str1 = "Udemy online courses";
        System.out.println("string1 = " + str1);
        // encode the contents of the String into a UTF-8 byte array
        byte[] arr = str1.getBytes(StandardCharsets.UTF_8);
        // decode the bytes back into a String using the same charset
        String str2 = new String(arr, StandardCharsets.UTF_8);
        System.out.println("new string = " + str2);
    }
}
Output:
string1 = Udemy online courses
new string = Udemy online courses
In the above code snippet, we create a String object as in the earlier snippet, but this time we encode it using an explicit charset, which is UTF-8 in this case. The byte array still holds numeric values. The output shows the original text because we decode the bytes back into a String before printing.
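The example string above contains only ASCII characters, which encode to the same bytes in many charsets. A minimal sketch with a non-ASCII character shows why the decoding charset should match the encoding charset. The class `RoundTripSketch` and the helper `roundTrip` are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration.
class RoundTripSketch {
    // Encodes the text with one charset and decodes the bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café", with é written as a Unicode escape

        // Same charset on both sides: the text survives unchanged.
        String matched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(matched.equals(original));

        // UTF-8 bytes read as ISO-8859-1: é (two bytes in UTF-8) becomes two characters.
        String mismatched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mismatched.equals(original));
        System.out.println(mismatched.length());
    }
}
```

With matching charsets the comparison is true. With the mismatch it is false and the decoded string has five characters instead of four, because the two UTF-8 bytes of é are each read as a separate Latin-1 character.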
public byte[] getBytes(String charsetName)
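In this form, the charset is given by name as a String, and the string is encoded using the named charset. If the name is not recognized, an UnsupportedEncodingException is thrown, so the calling code must catch or declare that exception.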
Code Snippet: public byte[] getBytes(String charsetName)
import java.io.UnsupportedEncodingException;

public class Main {
    public static void main(String[] args) {
        try {
            String str1 = "Udemy online courses";
            System.out.println("string1 = " + str1);
            // encode the contents of the String using the charset named "UTF-8"
            byte[] arr = str1.getBytes("UTF-8");
            // decode the bytes with the same charset name
            String str2 = new String(arr, "UTF-8");
            System.out.println("new string = " + str2);
        } catch (UnsupportedEncodingException e) {
            System.out.print(e.toString());
        }
    }
}
Output:
string1 = Udemy online courses
new string = Udemy online courses
Though we have used UTF-8 as the encoding format in all of the above examples, you can use any of the following encoding formats:
- US-ASCII: Seven-bit ASCII, a.k.a. ISO646-US
- ISO-8859-1: ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
- UTF-8: Eight-bit UCS Transformation Format
- UTF-16BE: Sixteen-bit UCS Transformation Format, big-endian byte order
- UTF-16LE: Sixteen-bit UCS Transformation Format, little-endian byte order
- UTF-16: Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark.
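To compare these formats, the following sketch encodes the same string with each of the six charsets and reports how many bytes result. The class `EncodingSizeSketch` and the helper `encodedLength` are hypothetical; the charset constants come from the standard `java.nio.charset.StandardCharsets` class.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration.
class EncodingSizeSketch {
    // Returns the number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        Charset[] charsets = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            StandardCharsets.UTF_16LE,
            StandardCharsets.UTF_16
        };
        for (Charset charset : charsets) {
            System.out.println(charset.name() + ": " + encodedLength("Udemy", charset));
        }
    }
}
```

For the five-character string "Udemy", US-ASCII, ISO-8859-1 and UTF-8 each produce 5 bytes, and UTF-16BE and UTF-16LE produce 10. UTF-16 produces 12 because Java's encoder writes a two-byte byte-order mark before the text.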