Problem:
Given a list of strings , encode them and then decode them back.
Consider the use case where you want to send a list of strings over the network.
You have the limitation of using only string data structure.
And somehow you need to represent the list of strings as a string and then finally when it is received at the receiver end you need to decode it back as a list of strings.
For our test case,
we need to write two methods,
One which encodes the list of strings into a string.
One which decodes the string back into a list of strings.
Assumptions:
- The characters in the string can be any ASCII character (256 ASCII characters).
- The string length can be max of 200 and minimum of 0.
- The number of strings can be from 1 to 200.
Example:
Input – [“Hello”,”World”]
Output – [“Hello”,”World”] (same as input ,only the encoded output is different)
Try out the solution here:
https://leetcode.com/problems/encode-and-decode-strings/
Solution:
Hint:
- Use a non ascii delimiter and connect all the strings together
- Append the length of the string before every string
Using Non Ascii Delimiters:
One naïve solution to the problem is to separate the strings using a delimiter like “,” (comma).
But the problem is that the delimiter character might also be present in the string.
To overcome this we can use a non ascii character as a delimiter.
As given in the assumption , there are 256 ASCII characters. So we can use the character represented by the integer 257.
Also if the list is empty , we can use one more non ascii character to represent it , lets say 258.
So to encode,
For every string in the list , we add them to an output string and append the non ascii character .
To decode,
we split the array based on the delimiter and convert that to a list of strings and return.
Code:
public class Codec {
// Encodes a list of strings to a single string.
public String encode(List<String> strs) {
if(strs.size() == 0){
return Character.toString((char)257);
}
StringBuilder sb = new StringBuilder();
for(String s:strs){
sb.append(s);
sb.append((char)258);
}
sb.deleteCharAt(sb.length() - 1);
return sb.toString();
}
// Decodes a single string to a list of strings.
public List<String> decode(String s) {
if(s.equals(Character.toString((char)257))) return new ArrayList<>();
//Add the parameter -1 to consider empty strings.
String[] array = s.split(Character.toString((char)258),-1);
return Arrays.asList(array);
}
}
// Your Codec object will be instantiated and called as such:
// Codec codec = new Codec();
// codec.decode(codec.encode(strs));
If you notice in the above code while splitting the string into array in the decode method , we pass an additional parameter of “-1”. This will make sure empty strings are also considered.
Otherwise , an input like:
[“”,””] will be decoded as [] instead of [“”,””]
Prefixing String length (Chunked Transfer Encoding):
In this approach , you calculate the length of each string and append this behind the string.
Now how can you know which string represents the length and which one represents the actual string?
You can resolve this in two ways:
- Use fixed number of bytes to represent length , so when you parse the string you parse that fixed number of bytes , get the length and then collect characters from that position for the obtained length and continue this until you reach the end of the string
- Separate the length and the string by a delimiter. This approach works even when the strings themselves have the delimiter character.
The second approach is easier and more elegant , so lets look into it.
Here is the code:
Code:
public class Codec {
// Encodes a list of strings to a single string.
public String encode(List<String> strs) {
StringBuilder sb = new StringBuilder();
for(String s:strs){
sb.append(s.length());
sb.append("-");
sb.append(s);
}
return sb.toString();
}
// Decodes a single string to a list of strings.
public List<String> decode(String s) {
List<String> output = new ArrayList<>();
int index = 0;
while(index < s.length()){
int hyphenIndex = s.indexOf('-',index);
int length =Integer.valueOf(s.substring(index,hyphenIndex));
index = hyphenIndex + length+1;
String o = s.substring(hyphenIndex+1,index);
output.add(o);
}
return output;
}
}
// Your Codec object will be instantiated and called as such:
// Codec codec = new Codec();
// codec.decode(codec.encode(strs));
Encoding is straightforward in the above case:
- You append each string’s length , then the delimiter and finally the string itself
For decoding,
- You parse the string until you find the delimiter
- The parsed string represents the length of the first string
- You use this length to get the substring following the delimiter
- you repeat the process until the end of the string.
Time complexity for both the approaches is O(2n) – one parse for encoding and one for decoding , which can be considered as O(n) where n is the number of strings
That’s it!
Leave a Reply