Custom String Comparison in Java
A hidden gem in the standard Java library is the RuleBasedCollator class. It lets you define a custom collation, that is, a set of rules for comparing the characters in strings, in a flexible way. Typical use cases include the following:
- you need to sort strings so that uppercase characters rank higher or lower than lowercase characters;
- you want to give higher or lower precedence to accented characters (like à) or to non-Latin characters;
- you need to specify any other non-standard rules for character precedence that would be too complicated to define in a Comparator implementation.
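As a rough illustration of the first use case, here is a minimal sketch that ranks each lowercase letter just before its uppercase counterpart. The rule string, the class name CollatorSketch, and the helper sortWithRules are hypothetical names chosen for this example; the rule covers only the letters a to c, so treat it as a starting point rather than a complete collation.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;

public class CollatorSketch {

    // Illustrative rule string (an assumption for this sketch):
    // '<' declares that the next character sorts strictly after the previous one.
    static final String RULES = "< a < A < b < B < c < C";

    // Returns a sorted copy of the input, ordered by the rules above.
    static List<String> sortWithRules(List<String> input) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(RULES);
            List<String> sorted = new ArrayList<>(input);
            // A Collator is a Comparator, so it can be passed to sort() directly.
            sorted.sort(collator);
            return sorted;
        } catch (ParseException e) {
            // Thrown by the constructor when the rule string is malformed.
            throw new IllegalArgumentException("Invalid collation rules", e);
        }
    }

    public static void main(String[] args) {
        // prints [a, A, b, B, c, C]
        System.out.println(sortWithRules(List.of("B", "a", "C", "b", "A", "c")));
    }
}
```

Swapping each pair in the rule string (for example "< A < a < B < b < C < c") would put uppercase letters first, without touching the sorting code.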
The RuleBasedCollator class lets you set the rules in a convenient and flexible manner, just by writing them as a string expression. For instance, if you sort the following list in the usual order, you get the expected result (I’m using the Stream.toList() method from Java 16 here; in earlier versions of Java you can use .collect(Collectors.toList()) instead):