Saturday, April 21, 2018

Java basics - localized Strings

Formatting numbers with Formatter in (Hungarian) locale

Java provides two ways to format text: java.util.Formatter (which String.format uses internally) or the java.text.Format classes such as NumberFormat. This part presents the use of Formatter.

Formatter syntax: %[argument_index$][flags][width][.precision]conversion
Some useful flags:
  • , for including locale-specific grouping separators
  • 0 for zero-padding
  • + for always including a sign
Default behavior: 
  • The output is right-justified within the width
  • Negative numbers begin with a '-' ('\u002d')
  • Positive numbers and zero do not include a sign or extra leading space
  • No grouping separators are included
  • The decimal separator will only appear if a digit follows it
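
To make the syntax and the default behavior above concrete, here is a minimal sketch. The class name FormatterSyntaxDemo and the helper hu(...) are hypothetical names introduced only for this illustration; the helper simply calls String.format with the Hungarian locale used throughout this post.

```java
import java.util.Locale;

class FormatterSyntaxDemo {
    // Hypothetical helper: formats with the Hungarian locale used in this post.
    static String hu(String pattern, Object... args) {
        return String.format(Locale.forLanguageTag("hu-HU"), pattern, args);
    }

    public static void main(String[] args) {
        System.out.println(hu("%2$s %1$s", "a", "b")); // argument_index picks the argument: "b a"
        System.out.println(hu("[%5d]", 42));           // width 5, right-justified by default: "[   42]"
        System.out.println(hu("[%-5d]", 42));          // '-' flag left-justifies: "[42   ]"
        System.out.println(hu("[%05d]", 42));          // '0' flag zero-pads: "[00042]"
        System.out.println(hu("%+d", 42));             // '+' flag always includes a sign: "+42"
        System.out.println(hu("%.0f", 3.0));           // no digit follows, so no decimal separator: "3"
    }
}
```

The brackets are only there to make the padding visible in the output.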

In the following examples we'll use this locale:
Locale huLocale = Locale.forLanguageTag("hu-HU");

Non-floating-point decimal numbers (integers)

Remember: d stands for decimal.
assertEquals("0009", String.format(huLocale, "%0,4d", 9));
assertEquals("01 000", String.format(huLocale, "%0,6d", 1000));

assertEquals("100000000", String.format(huLocale, "%d", 100000000));
assertEquals("100 000 000", String.format(huLocale, "%,d", 100000000));
assertEquals("+100000000", String.format(huLocale, "%+d", 100000000));
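
One thing worth checking when such assertions fail: the grouping separator comes from the locale's DecimalFormatSymbols, and depending on the JDK's locale data it may be a non-breaking space (U+00A0) rather than a plain ASCII space. That is an assumption to verify on your own JDK, not something this post states. The sketch below (GroupingSeparatorDemo is a hypothetical name) builds the expected string from the separator itself instead of typing the space by hand.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class GroupingSeparatorDemo {
    static final Locale HU = Locale.forLanguageTag("hu-HU");

    // Builds the expected text using whatever grouping separator the locale data defines.
    static String expectedGrouped() {
        char sep = DecimalFormatSymbols.getInstance(HU).getGroupingSeparator();
        return "100" + sep + "000" + sep + "000";
    }

    // The same number formatted with the ',' flag.
    static String formatted() {
        return String.format(HU, "%,d", 100000000);
    }

    public static void main(String[] args) {
        char sep = DecimalFormatSymbols.getInstance(HU).getGroupingSeparator();
        // Print the code point so an invisible difference between space types becomes visible.
        System.out.printf("grouping separator is U+%04X%n", (int) sep);
        System.out.println(expectedGrouped().equals(formatted()));
    }
}
```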

Floating point decimal numbers

Remember: f stands for floating point.
assertEquals("1000,246800", String.format(huLocale, "%f", 1000.2468));
assertEquals("1 000,246800", String.format(huLocale, "%,f", 1000.2468));
assertEquals("1 000,25", String.format(huLocale, "%,.2f", 1000.2468));

Scientific decimal numbers

assertEquals("1,000247e+03", String.format(huLocale, "%e", 1000.2468));
assertEquals("1,00024680e+03", String.format(huLocale, "%.8e", 1000.2468));

Automatic switching between scientific and floating point formatting

assertEquals("123,456", String.format(huLocale, "%g", 123.456));
assertEquals("123,5", String.format(huLocale, "%.4g", 123.456));
assertEquals("123", String.format(huLocale, "%.3g", 123.456));
assertEquals("1,2e+02", String.format(huLocale, "%.2g", 123.456));
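
All assertions above go through String.format. As a minimal sketch of using java.util.Formatter directly, the example below appends several conversions of the same value to one StringBuilder. The class and method names (FormatterDirectDemo, describe) are hypothetical and exist only for this illustration.

```java
import java.util.Formatter;
import java.util.Locale;

class FormatterDirectDemo {
    // Formats one value three ways into a single StringBuilder.
    static String describe(double value) {
        StringBuilder out = new StringBuilder();
        // Formatter is Closeable, so try-with-resources is a convenient way to release it.
        try (Formatter formatter = new Formatter(out, Locale.forLanguageTag("hu-HU"))) {
            formatter.format("%.2f", value);    // fixed-point, 2 decimals
            formatter.format(" | %.3e", value); // scientific, 3 decimals
            formatter.format(" | %.4g", value); // general, 4 significant digits
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(123.456)); // prints: 123,46 | 1,235e+02 | 123,5
    }
}
```

Using one Formatter for several calls avoids creating a new one per String.format call, which may matter when formatting in a loop.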

Parsing localized (Hungarian) string into double

Locale huLocale = Locale.forLanguageTag("hu-HU");
NumberFormat nf = NumberFormat.getInstance(huLocale);
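// parse(String) throws the checked ParseException; a delta of 1 would also accept 0.5 or 1.9, so use a small one
assertEquals(1.0, nf.parse("1,0").doubleValue(), 1e-9);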
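
Note that NumberFormat.parse(String) only needs a parsable prefix, so trailing garbage is silently ignored. If the whole input must be a number, a ParsePosition can be used to check how far the parser got. The following is a sketch under that assumption; StrictHungarianParser and parseStrict are hypothetical names, not part of the JDK.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;
import java.util.OptionalDouble;

class StrictHungarianParser {
    // Returns the parsed value only if the entire text was consumed by the parser.
    static OptionalDouble parseStrict(String text) {
        NumberFormat nf = NumberFormat.getInstance(Locale.forLanguageTag("hu-HU"));
        ParsePosition pos = new ParsePosition(0);
        Number parsed = nf.parse(text, pos); // returns null instead of throwing on failure
        if (parsed == null || pos.getIndex() != text.length()) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(parsed.doubleValue());
    }

    public static void main(String[] args) {
        System.out.println(parseStrict("1,5").isPresent());    // true, value is 1.5
        System.out.println(parseStrict("1,5abc").isPresent()); // false, "abc" was not consumed
        System.out.println(parseStrict("abc").isPresent());    // false, nothing could be parsed
    }
}
```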

Sorting in (Hungarian) localized order

To sort in (localized) natural-language order, one must use a Collator.
Once there is a Collator, it can be used to sort like this:
Collections.sort(list, myCollator);
For locales that have predefined constants, e.g. en-US, one can use Collator.getInstance(Locale.US);
For other languages one can try Collator.getInstance(Locale.forLanguageTag("hu-HU"));
If that does not sort as expected, one has to specify the sorting rules for the language oneself. The example below first shows the order produced by the built-in Hungarian collator, then the order produced by a RuleBasedCollator with explicit rules:
List<String> list = Arrays.asList("az", "áll", "CT", "csomag", "ez", "ég", "itt", "így",
  "Ozora", "óvoda", "ötös", "őriz", "uzsonna", "út", "ütő", "űr");

Collator huCollator = Collator.getInstance(Locale.forLanguageTag("hu-HU"));
Collections.sort(list, huCollator);
assertEquals(Arrays.asList("áll", "az", "CT", "csomag", "ég", "ez", "így", "itt",
  "óvoda", "Ozora", "ötös", "őriz", "út", "uzsonna", "ütő", "űr"), list);

String hungarian = "< a,A < á,Á < b,B < c,C < cs,Cs,CS < d,D < dz,Dz,DZ" +
      " < dzs,Dzs,DZS < e,E < é,É < f,F < g,G < gy,Gy,GY < h,H < i,I " +
      "< í,Í < j,J < k,K < l,L < ly,Ly,LY < m,M < n,N < ny,Ny,NY < o,O " +
      "< ó,Ó < ö,Ö < ő,Ő < p,P < q,Q < r,R < s,S < sz,Sz,SZ < t,T < ty,Ty,TY " +
      "< u,U < ú,Ú < ü,Ü < ű,Ű < v,V < w,W < x,X < y,Y < z,Z < zs,Zs,ZS";
// The RuleBasedCollator constructor throws the checked ParseException if the rules are malformed
Collator collator = new RuleBasedCollator(hungarian);
Collections.sort(list, collator);
assertEquals(Arrays.asList("az", "áll", "CT", "csomag", "ez", "ég", "itt", "így",
  "Ozora", "óvoda", "ötös", "őriz", "uzsonna", "út", "ütő", "űr"), list);

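To see what a Collator changes compared with plain String ordering, here is a small sketch using the en-US collator mentioned earlier. The class and method names are hypothetical; the word list is made up for the illustration.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class CollatorVersusNaturalOrder {
    // Sorts by String.compareTo, i.e. by UTF-16 code unit values.
    static List<String> naturalOrder(String... words) {
        List<String> list = new ArrayList<>(Arrays.asList(words));
        Collections.sort(list);
        return list;
    }

    // Sorts with the en-US collator; Collator implements Comparator<Object>.
    static List<String> collatedOrder(String... words) {
        List<String> list = new ArrayList<>(Arrays.asList(words));
        Collections.sort(list, Collator.getInstance(Locale.US));
        return list;
    }

    public static void main(String[] args) {
        // Uppercase letters have smaller code units than lowercase ones,
        // so plain ordering puts "Banana" first.
        System.out.println(naturalOrder("cherry", "apple", "Banana"));  // [Banana, apple, cherry]
        System.out.println(collatedOrder("cherry", "apple", "Banana")); // [apple, Banana, cherry]
    }
}
```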