Monday, April 30, 2018

Comparing objects in Java

This is our example class:
public class Person {
 private final int age;
 private final String name;
 public Person(int age, String name) {
  this.age = age;
  this.name = name;
 }
 public int getAge() {
  return age;
 }
 public String getName() {
  return name;
 }
}
When implementing this.compareTo(that) or compare(this, that), follow these rules:
  • return a negative number (conventionally -1) if this is less than that
  • return 0 if this is equal to that; this should be consistent with equals(), i.e. return 0 exactly when this.equals(that) is true
  • return a positive number (conventionally 1) if this is greater than that
The two basic ways to achieve comparison are the following:

Implementing the Comparable interface

public class Person implements Comparable<Person> {
 // ...
 @Override
 public int compareTo(Person that) {
  if (this.getName().compareTo(that.getName()) == 0) {
   if (this.getAge() < that.getAge()) return -1;
   else if (this.getAge() > that.getAge()) return 1;
   else return 0;
  }
  else return this.getName().compareTo(that.getName());
 }
 // ...
}

Using a Comparator

The Comparator interface can be implemented by an anonymous class, either inside the compared class (as below) or at the place of use, for example when passing it to Collections.sort() as an argument.
Implementing only the compare() method is enough.
public class Person {
 // ...
 public static Comparator<Person> byAge() {
  return new Comparator<Person>() {
   @Override
   public int compare(Person p1, Person p2) {
    if (p1.getAge() < p2.getAge()) return -1;
    else if (p1.getAge() > p2.getAge()) return 1;
    else return p1.getName().compareTo(p2.getName());
   }
  };
 }

 public static Comparator<Person> byName() {
  return new Comparator<Person>() {
   @Override
   public int compare(Person p1, Person p2) {
    if (p1.getName().equals(p2.getName())) {
     if (p1.getAge() < p2.getAge()) return -1;
     else if (p1.getAge() > p2.getAge()) return 1;
     else return 0;
    }
    else return p1.getName().compareTo(p2.getName());
   }
  };
 }
 // ...
}
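Since Java 8 the same comparators can also be built from the static factory methods of Comparator instead of anonymous classes. The following is only a sketch: the class name ComparatorSketch, the constants BY_AGE and BY_NAME, and the helper sortedNames are made up for this illustration, and Person is a simplified stand-in for the class above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ComparatorSketch {
    // Simplified stand-in for the Person class of this post.
    static final class Person {
        private final int age;
        private final String name;
        Person(int age, String name) { this.age = age; this.name = name; }
        int getAge() { return age; }
        String getName() { return name; }
    }

    // Age first, name as tie-breaker (same ordering as byAge()).
    static final Comparator<Person> BY_AGE =
            Comparator.comparingInt(Person::getAge).thenComparing(Person::getName);

    // Name first, age as tie-breaker (same ordering as byName()).
    static final Comparator<Person> BY_NAME =
            Comparator.comparing(Person::getName).thenComparingInt(Person::getAge);

    // Sorts a fixed sample list with the given comparator and returns "name:age" labels.
    static List<String> sortedNames(Comparator<Person> order) {
        List<Person> people = new ArrayList<>(Arrays.asList(
                new Person(20, "Anna"), new Person(20, "Amanda"),
                new Person(19, "Beata"), new Person(21, "Anna")));
        people.sort(order);
        List<String> labels = new ArrayList<>();
        for (Person p : people) {
            labels.add(p.getName() + ":" + p.getAge());
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames(BY_AGE));
        System.out.println(sortedNames(BY_NAME));
    }
}
```

comparingInt compares the int keys directly, so there is no boxing and no hand-written -1/0/1 branching.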

Sorting Objects

Now that we have ways to compare objects, we can use them for sorting:
private Person a = new Person(20, "Amanda");
private Person b = new Person(20, "Anna");
private Person c = new Person(21, "Anna");
private Person d = new Person(19, "Beata");
private List<Person> list = Arrays.asList(b, a, d, c);

// using the Comparable implementation
Collections.sort(list);
assertEquals(Arrays.asList(a, b, c, d), list);
Collections.reverse(list);
assertEquals(Arrays.asList(d, c, b, a), list);
Collections.sort(list, Collections.reverseOrder());
assertEquals(Arrays.asList(d, c, b, a), list);

// using the Comparator implementation
Collections.sort(list, Person.byAge());
assertEquals(Arrays.asList(d, a, b, c), list);
Collections.sort(list, Collections.reverseOrder(Person.byAge()));
assertEquals(Arrays.asList(c, b, a, d), list);

Sunday, April 22, 2018

Eclipse: frequently used keyboard shortcuts

File
Save Ctrl+S
Save All Shift+Ctrl+S
Refactor - Java
Extract Local Variable Shift+Alt+L
Extract Method Shift+Alt+M
Inline Shift+Alt+I
Rename - Refactoring Shift+Alt+R
Source
Format Shift+Ctrl+F
Organize Imports Shift+Ctrl+O
Toggle Comment Shift+Ctrl+C, Ctrl+/, Ctrl+7
Toggle Mark Occurrences Shift+Alt+O
Text Editing
Copy Lines Ctrl+Alt+Down
Delete Line Ctrl+D
Duplicate Lines Ctrl+Alt+Up
Insert Line Above Current Line Shift+Ctrl+Enter
Insert Line Below Current Line Shift+Enter
Join Lines Ctrl+Alt+J
Line End End
Line Start Home
Move Lines Down Alt+Down
Move Lines Up Alt+Up
Toggle Block Selection Shift+Alt+A

Saturday, April 21, 2018

Java basics - localized Strings

Formatting numbers with Formatter in (Hungarian) locale

Java provides two ways to format text: using Formatter or using Format. This part presents the use of Formatter.

Formatter syntax: %[argument_index$][flags][width][.precision]conversion
Some useful flags:
  • , for including locale-specific grouping separators
  • 0 for zero-padding
  • + for always including a sign
Default behavior: 
  • The output is right-justified within the width
  • Negative numbers begin with a '-' ('\u002d')
  • Positive numbers and zero do not include a sign or extra leading space
  • No grouping separators are included
  • The decimal separator will only appear if a digit follows it
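The parts of the format syntax above (argument_index, flags, width) can be tried out in isolation. A small sketch, assuming Locale.ROOT so that the output does not depend on the machine's default locale; the class FormatterSketch and the helper fmt are names invented for this example.

```java
import java.util.Locale;

class FormatterSketch {
    // Formats with Locale.ROOT so the result does not depend on the default locale.
    static String fmt(String pattern, Object... args) {
        return String.format(Locale.ROOT, pattern, args);
    }

    public static void main(String[] args) {
        System.out.println(fmt("[%5d]", 42));           // width 5, right-justified by default
        System.out.println(fmt("[%-5d]", 42));          // '-' flag: left-justified
        System.out.println(fmt("[%05d]", 42));          // '0' flag: zero-padded to the width
        System.out.println(fmt("[%+d]", 42));           // '+' flag: always include a sign
        System.out.println(fmt("%2$s %1$s", "a", "b")); // argument_index: reorder the arguments
    }
}
```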

In the following example we'll use this locale:
Locale huLocale = Locale.forLanguageTag("hu-HU");

Non floating-point decimal numbers

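Remember: d stands for decimal. Note that the hu-HU grouping separator is typically a non-breaking space (U+00A0), so the expected strings below must contain that character rather than a regular space.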
assertEquals("0009", String.format(huLocale, "%0,4d", 9));
assertEquals("01 000", String.format(huLocale, "%0,6d", 1000));

assertEquals("100000000", String.format(huLocale, "%d", 100000000));
assertEquals("100 000 000", String.format(huLocale, "%,d", 100000000));
assertEquals("+100000000", String.format(huLocale, "%+d", 100000000));

Floating point decimal numbers

Remember: f stands for floating point.
assertEquals("1000,246800",String.format(huLocale, "%f", 1000.2468));
assertEquals("1 000,246800", String.format(huLocale, "%,f", 1000.2468));
assertEquals("1 000,25", String.format(huLocale, "%,.2f", 1000.2468));

Scientific decimal numbers

assertEquals("1,000247e+03", String.format(huLocale, "%e", 1000.2468));
assertEquals("1,00024680e+03", String.format(huLocale, "%.8e", 1000.2468));

Automatic switching between scientific and floating point formatting

assertEquals("123,456", String.format(huLocale, "%g", 123.456));
assertEquals("123,5", String.format(huLocale, "%.4g", 123.456));
assertEquals("123", String.format(huLocale, "%.3g", 123.456));
assertEquals("1,2e+02", String.format(huLocale, "%.2g", 123.456));

Parsing localized (Hungarian) string into double

Locale huLocale = Locale.forLanguageTag("hu-HU");
NumberFormat nf = NumberFormat.getInstance(huLocale);
// parse() throws a checked ParseException; a delta of 0.0 demands an exact match
assertEquals(1.0, nf.parse("1,0").doubleValue(), 0.0);
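One thing to be aware of: NumberFormat.parse(String) stops at the first character it cannot interpret and silently ignores the rest, so "1,5abc" would still parse as 1.5. If that is not wanted, a ParsePosition can be used to check that the whole input was consumed. This is a sketch of that idea; StrictParseSketch and parseStrict are hypothetical names, not part of the JDK.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;
import java.util.OptionalDouble;

class StrictParseSketch {
    // Parses the text as a number in the given locale, but only if all of it is numeric.
    static OptionalDouble parseStrict(String text, Locale locale) {
        NumberFormat nf = NumberFormat.getInstance(locale);
        ParsePosition pos = new ParsePosition(0);
        Number n = nf.parse(text, pos); // returns null if nothing could be parsed
        if (n == null || pos.getIndex() != text.length()) {
            return OptionalDouble.empty(); // unparsable input or trailing garbage
        }
        return OptionalDouble.of(n.doubleValue());
    }

    public static void main(String[] args) {
        Locale hu = Locale.forLanguageTag("hu-HU");
        System.out.println(parseStrict("1,5", hu));    // the whole text is a number
        System.out.println(parseStrict("1,5abc", hu)); // trailing text is rejected
    }
}
```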

Sorting in (Hungarian) localized order

To sort in (localized) natural-language order, one must use a Collator.
Once there is a Collator, it can be used to sort like this:
Collections.sort(list, myCollator);
For locales that have a predefined constant, e.g. en-US, one can use Collator.getInstance(Locale.US);
For other languages one can try Collator.getInstance(Locale.forLanguageTag("hu-HU"));
But if that does not work as expected, one has to specify the sorting rules for the language oneself:
List<String> list = Arrays.asList("az", "áll", "CT", "csomag", "ez", "ég", "itt", "így",
  "Ozora", "óvoda", "ötös", "őriz", "uzsonna", "út", "ütő", "űr");

Collator huCollator = Collator.getInstance(Locale.forLanguageTag("hu-HU"));
Collections.sort(list, huCollator);
assertEquals(Arrays.asList("áll", "az", "CT", "csomag", "ég", "ez", "így", "itt",
  "óvoda", "Ozora", "ötös", "őriz", "út", "uzsonna", "ütő", "űr"), list);

String hungarian = "< a,A < á,Á < b,B < c,C < cs,Cs,CS < d,D < dz,Dz,DZ" +
      " < dzs,Dzs,DZS < e,E < é,É < f,F < g,G < gy,Gy,GY < h,H < i,I " +
      "< í,Í < j,J < k,K < l,L < ly,Ly,LY < m,M < n,N < ny,Ny,NY < o,O " +
      "< ó,Ó < ö,Ö < ő,Ő < p,P < q,Q < r,R < s,S < sz,Sz,SZ < t,T < ty,Ty,TY " +
      "< u,U < ú,Ú < ü,Ü < ű,Ű < v,V < w,W < x,X < y,Y < z,Z < zs,Zs,ZS";
Collator collator = new RuleBasedCollator(hungarian);
Collections.sort(list, collator);
assertEquals(Arrays.asList("az", "áll", "CT", "csomag", "ez", "ég", "itt", "így",
  "Ozora", "óvoda", "ötös", "őriz", "uzsonna", "út", "ütő", "űr"), list);

Sunday, April 15, 2018

Java basics - working with text

About Unicode in Java

Java uses the 16-bit UTF-16 encoding of Unicode internally.
Unicode distinguishes between, on the one hand, the association of characters as abstract concepts (e.g., "Greek capital letter omega Ω") with a subset of the natural numbers, called code points, and, on the other hand, the representation of code points by values stored in units of a computer's memory. The Unicode standard defines seven of these character encoding schemes.
In Java, the 65536 possible values of a char are UTF-16 code units, i.e. values that are used in the UTF-16 encoding of Unicode text. Any representation of Unicode must be capable of representing the full range of code points, whose upper bound is 0x10FFFF. Thus, code points beyond 0xFFFF need to be represented by pairs of UTF-16 code units, and the values used in these so-called surrogate pairs are exempt from being used as code points themselves.
The full range of Unicode code points can only be stored in a variable of type int; a single char variable cannot represent every Unicode character.

When to be cautious with Characters?

It is possible that a String value contains surrogate pairs intermingled with individual code units.
In such cases one character can take up two indices in the string.
To verify that the string consists only of individual code units (no surrogate pairs) one can use:
s.length() == s.codePointCount(0, s.length())
This works because String.length() returns the number of code units (16-bit char values) in the string, while String.codePointCount() returns the number of characters (code points), counting each supplementary character once.
If you have to process strings containing surrogate pairs, there's an implementation of a Unicode-aware charAt method in this article (using the offsetByCodePoints and codePointAt methods of String).
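To make the difference between code units and code points concrete, here is a minimal sketch. It is not the implementation from the linked article; CodePointSketch, hasOnlyBmpChars and characterAt are names chosen for this example, and the sample string contains U+1F600, which needs a surrogate pair.

```java
class CodePointSketch {
    // True if the string contains no surrogate pairs (one char per character).
    static boolean hasOnlyBmpChars(String s) {
        return s.length() == s.codePointCount(0, s.length());
    }

    // Returns the n-th character (code point), not the n-th char, as a String.
    static String characterAt(String s, int n) {
        int index = s.offsetByCodePoints(0, n); // char index of the n-th code point
        return new String(Character.toChars(s.codePointAt(index)));
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b"; // 'a', U+1F600 (a surrogate pair), 'b'
        System.out.println(s.length());                      // number of UTF-16 code units
        System.out.println(s.codePointCount(0, s.length())); // number of code points
        System.out.println(hasOnlyBmpChars(s));
        System.out.println(characterAt(s, 2));
    }
}
```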

For conversion to uppercase and lowercase, prefer the String.toUpperCase() and String.toLowerCase() methods: unlike their Character counterparts, they can handle mappings that change the number of characters as well as locale-specific rules.
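Two well-known cases illustrate the difference: the German sharp s has no single-character uppercase form, and Turkish maps i to a dotted capital I. The sketch below uses Unicode escapes so it does not depend on the source file encoding; the class name CaseMappingSketch and the helper upper are invented for this example.

```java
import java.util.Locale;

class CaseMappingSketch {
    // Uppercases with an explicit locale, so the result does not depend on the default locale.
    static String upper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        // String mapping: sharp s (U+00DF) becomes the two letters "SS"
        System.out.println(upper("stra\u00DFe", Locale.ROOT));
        // char mapping: one char in, one char out, so the sharp s stays as it is
        System.out.println(Character.toUpperCase('\u00DF'));
        // locale-sensitive mapping: Turkish 'i' becomes U+0130 (capital I with dot above)
        System.out.println(upper("i", Locale.forLanguageTag("tr")));
    }
}
```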

Working with text in Java

If your editor and file system allow it, you can use Unicode characters directly in your source code.
Always use 'single quotes' for char literals and "double quotes" for String literals.
Escape sequences for char and String literals: \b (backspace), \t (tab), \n (line feed), \f (form feed), \r (carriage return), \" (double quote), \' (single quote), and \\ (backslash).

Primitive type: char

char is a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535 inclusive).
Default value: '\u0000'
Check for default value: ch == Character.MIN_VALUE or ch == 0

Non-primitive types: Character and String

How to decide if something is a letter, a digit, or whitespace?

Use Java's built in Character class for that:
char ch = '\u0041';
assertEquals('A', ch );
assertFalse(Character.isDigit(ch));
assertTrue(Character.isLetter(ch));
assertTrue(Character.isLetterOrDigit(ch));
assertFalse(Character.isLowerCase(ch));
assertTrue(Character.isUpperCase(ch));
assertFalse(Character.isSpaceChar(ch));
assertTrue(Character.isDefined(ch));

How to decide if some letter is in the English Alphabet?

char ch = 'A';
assertTrue(((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')));
ch = 'á';
assertFalse(((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')));
Or using regex:
Pattern p = Pattern.compile("[A-Za-z]");
assertTrue(p.matcher("a").find());
assertFalse(p.matcher("Á").find()); 

Sorting text

The default String comparison (compareTo) is based on the Unicode values of the characters (putting uppercase before lowercase, etc.).
To sort in (localized) natural-language order one must use a Collator. An example usage is shown in this article, here's a lengthier demonstration, and here's some information on how to customize sorting rules.
List<String> list = Arrays.asList("alma", "Anna", "Ági", "ágy");
Collections.sort(list);
assertEquals(Arrays.asList("Anna", "alma", "Ági", "ágy"), list);

Collections.sort(list, String.CASE_INSENSITIVE_ORDER);
assertEquals(Arrays.asList("alma", "Anna", "Ági", "ágy"), list);

Collator huCollator = Collator.getInstance(Locale.forLanguageTag("hu-HU"));
Collections.sort(list, huCollator);
assertEquals(Arrays.asList("Ági", "ágy", "alma", "Anna"), list);

Splitting and joining text

Conversion between String and char

String str = "My fancy text";
char[] chars = str.toCharArray();
String joined = new String(chars);
assertEquals(str, joined);

Splitting and joining Strings

String str = "My fancy text";
String[] parts = str.split(" ");
String joined = String.join(" ", parts);
assertEquals(str, joined);

Parsing text

Scanner

A Scanner breaks its input into tokens using a delimiter pattern.
Default delimiter: whitespace. Set a different one with Scanner.useDelimiter().
Localization for reading numbers: via the Scanner.useLocale(locale) method.
Reset to the defaults with the Scanner.reset() method.
Delimiters are given as a regex (a String or a compiled Pattern).
Navigate with Scanner.next(), which returns the next token (the String between the current and the next delimiter).
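Putting the delimiter and locale settings together, a Scanner can read a mixed list of localized numbers and plain tokens. A small sketch under the assumption that the input is separated by semicolons and uses Hungarian number formatting; ScannerSketch and readTokens are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class ScannerSketch {
    // Reads ';'-separated tokens; numeric tokens are parsed using the Hungarian locale.
    static List<Object> readTokens(String input) {
        List<Object> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(";").useLocale(Locale.forLanguageTag("hu-HU"));
            while (scanner.hasNext()) {
                if (scanner.hasNextDouble()) {
                    tokens.add(scanner.nextDouble()); // "1,5" is read as 1.5
                } else {
                    tokens.add(scanner.next());       // anything else stays a String
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(readTokens("1,5;2,25;abc"));
    }
}
```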

BreakIterator

The BreakIterator class implements methods for finding the location of boundaries in text. Instances of BreakIterator maintain a current position and scan over text returning the index of characters where boundaries occur.
Boundaries:
  • Character
  • Word
  • Sentence
  • Line
Navigate with BreakIterator.next() and BreakIterator.previous(), which return the int index of the next or previous boundary (or BreakIterator.DONE when there are no more).
For info on usage see the Java Tutorial.
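As a quick illustration of word boundaries, the following sketch collects the words of a text and skips the segments that consist of whitespace or punctuation. The check on the first character of each segment is a simplification chosen for this example, and BreakIteratorSketch and words are made-up names.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BreakIteratorSketch {
    // Extracts the words of a text, skipping whitespace and punctuation segments.
    static List<String> words(String text) {
        List<String> words = new ArrayList<>();
        BreakIterator it = BreakIterator.getWordInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            // a segment between two boundaries counts as a word if it starts with a letter or digit
            if (Character.isLetterOrDigit(text.charAt(start))) {
                words.add(text.substring(start, end));
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world!"));
    }
}
```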

StringTokenizer

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

String.split()

Splits a string around matches of the given regular expression and returns an array.
It matches delimiters by regex like Scanner does, but parses the whole text at once.
There is no default delimiter: the regex must always be given (e.g. "\\s+" for runs of whitespace).
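A few cases of String.split() are worth seeing side by side, especially how the optional limit argument affects trailing empty strings. This is just a sketch; SplitSketch and its split helper are names invented here to wrap the call in a List for easy printing.

```java
import java.util.Arrays;
import java.util.List;

class SplitSketch {
    // Thin wrapper around String.split(regex, limit) that returns a List.
    static List<String> split(String text, String regex, int limit) {
        return Arrays.asList(text.split(regex, limit));
    }

    public static void main(String[] args) {
        // runs of whitespace as one delimiter
        System.out.println(split("a  b\tc", "\\s+", 0));
        // limit 0 (the default): trailing empty strings are dropped
        System.out.println(split("a,b,,", ",", 0));
        // negative limit: trailing empty strings are kept
        System.out.println(split("a,b,,", ",", -1));
        // positive limit: at most that many parts, the last one holds the rest
        System.out.println(split("a,b,c", ",", 2));
    }
}
```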

Pattern matching

java.util.regex contains classes for matching character sequences against patterns specified by regular expressions.
An instance of the Pattern class represents a regular expression that is specified in string form in a syntax similar to that used by Perl.
Instances of the Matcher class are used to match character sequences against a given pattern. Input is provided to matchers via the CharSequence interface in order to support matching against characters from a wide variety of input sources.

The different matching methods of Matcher

Pattern pattern = Pattern.compile("foo");

// find: all occurrences one by one
assertTrue(pattern.matcher("afooo").find());
assertFalse(pattern.matcher("aooo").find());
// find starting at given index
assertTrue(pattern.matcher("afooo").find(0));
assertTrue(pattern.matcher("afooo").find(1));
assertFalse(pattern.matcher("afooo").find(2));

// lookingAt: like String.startsWith() but with regex
assertTrue(pattern.matcher("fooo").lookingAt());
assertFalse(pattern.matcher("afooo").lookingAt());

// matches: like String.equals() but with regex
assertTrue(pattern.matcher("foo").matches());
assertFalse(pattern.matcher("fooo").matches());

Retrieving matched subsequences 

The explicit state of a matcher includes the start and end indices of the most recent successful match. It also includes the start and end indices of the input subsequence captured by each capturing group in the pattern as well as a total count of such subsequences. This can be used to retrieve what is matched:
Pattern pattern = Pattern.compile("f.o");
Matcher matcher = pattern.matcher("afaoofeoofo");
assertTrue(matcher.find()); // finds the first match
assertEquals("fao", matcher.group()); 
assertTrue(matcher.find()); // finds the second match
assertEquals("feo", matcher.group());
assertFalse(matcher.find()); // no more to find
matcher.reset(); // resets the matcher
assertTrue(matcher.find()); // finds the first match again
assertEquals("fao", matcher.group());

Iterating over the matches

while(matcher.find()) {
 String group = matcher.group();
}

Using capturing groups

Pattern pattern = Pattern.compile("(f(.)o)");
Matcher matcher = pattern.matcher("afaoofeoofuo");
assertEquals(2, matcher.groupCount()); // groups specified in pattern
assertTrue(matcher.find()); // finds the first match
assertEquals("fao", matcher.group(1)); //referencing the capturing group
assertEquals("a", matcher.group(2)); //referencing the capturing group

Making replacements

Replace the first substring of a string that matches the given regular expression with the given replacement:
  • str.replaceFirst(regex, repl) yields exactly the same result as Pattern.compile(regex).matcher(str).replaceFirst(repl)
Replace each substring of this string that matches the given regular expression with the given replacement:
  • str.replaceAll(regex, repl) yields exactly the same result as Pattern.compile(regex).matcher(str).replaceAll(repl)
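The replacement string may refer to capturing groups with $1, $2 and so on. A short sketch of both methods on the same input; the class ReplaceSketch and its two helpers are hypothetical and only wrap the String methods.

```java
class ReplaceSketch {
    // Replaces every "f?o" with the middle character wrapped in angle brackets.
    static String wrapAll(String s) {
        return s.replaceAll("f(.)o", "<$1>");
    }

    // Same replacement, but only for the first match.
    static String wrapFirst(String s) {
        return s.replaceFirst("f(.)o", "<$1>");
    }

    public static void main(String[] args) {
        System.out.println(wrapAll("afaoofeo"));
        System.out.println(wrapFirst("afaoofeo"));
    }
}
```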

Making complex replacements

To have more control on the replacement, use Matcher.appendReplacement() with Matcher.appendTail().
The most basic case: replace with fixed string
Pattern p = Pattern.compile("f.o");
Matcher m = p.matcher("afaoofeoofuo");
StringBuffer sb = new StringBuffer(); // the buffer to write the result to
while (m.find()) {
 m.appendReplacement(sb, "-"); // replace the whole match with the given string
}
m.appendTail(sb); // write the rest of the string after the last match to the buffer.
assertEquals("a-o-o-", sb.toString());
A more complex case: replace with multiple capturing groups
Pattern p = Pattern.compile("(f)(.)(o)");
Matcher m = p.matcher("afaoofeoofuo");
StringBuffer sb = new StringBuffer();
while (m.find()) {
 m.appendReplacement(sb, "$1-$3"); // replace only the second group
}
m.appendTail(sb);
assertEquals("af-oof-oof-o", sb.toString());
A more complex case: replace with value from map
Map<String, String> map = new HashMap<>();
map.put("a", "1"); map.put("e", "2"); map.put("u", "3");
Pattern p = Pattern.compile("(f)(.)(o)");
Matcher m = p.matcher("afaoofeoofuo");
StringBuffer sb = new StringBuffer();
while (m.find()) {
 m.appendReplacement(sb, "$1" + map.get(m.group(2)) + "$3"); // replace only the second group
}
m.appendTail(sb);
assertEquals("af1oof2oof3o", sb.toString());
Note: If you want the replacement to contain $ or \ literals, wrap it in Matcher.quoteReplacement().
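The effect of Matcher.quoteReplacement() is easiest to see with a value that contains a dollar sign or a backslash. In this sketch the "{}" placeholder syntax and the names QuoteReplacementSketch, fill and fillUnquotedFails are assumptions made for the example.

```java
import java.util.regex.Matcher;

class QuoteReplacementSketch {
    // Replaces every literal "{}" placeholder with the value, taken literally.
    static String fill(String template, String value) {
        return template.replaceAll("\\{\\}", Matcher.quoteReplacement(value));
    }

    // Shows what happens without quoting: "$5" is read as a reference to group 5.
    static boolean fillUnquotedFails(String template, String value) {
        try {
            template.replaceAll("\\{\\}", value);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true; // the pattern has no such group
        }
    }

    public static void main(String[] args) {
        System.out.println(fill("price: {}", "$5"));
        System.out.println(fill("path: {}", "C:\\temp"));
        System.out.println(fillUnquotedFails("price: {}", "$5"));
    }
}
```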

Escape special characters with double backslash

Regex patterns are specified within String literals. Java has some reserved escapes like \n for line break, so the regex escapes like \s need to be escaped with an extra \ resulting in \\s for matching a whitespace character.
The string literal "\b", for example, matches a single backspace character when interpreted as a regular expression, while "\\b" matches a word boundary.
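A few lines of code show the double backslash in practice, together with Pattern.quote(), which escapes a whole delimiter so that it is matched literally. This is only a sketch; RegexEscapeSketch, WORD_CAT, containsWordCat and splitOnLiteral are names made up for it.

```java
import java.util.regex.Pattern;

class RegexEscapeSketch {
    // "\\b" in the Java literal is the regex \b, a word boundary.
    static final Pattern WORD_CAT = Pattern.compile("\\bcat\\b");

    // True if the text contains "cat" as a separate word.
    static boolean containsWordCat(String text) {
        return WORD_CAT.matcher(text).find();
    }

    // Splits on the delimiter taken literally, whatever regex metacharacters it contains.
    static String[] splitOnLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(containsWordCat("a cat sat"));
        System.out.println(containsWordCat("concat"));
        // an unescaped "." matches every character, so nothing is left after splitting
        System.out.println("a.b".split(".").length);
        System.out.println(splitOnLiteral("a.b", ".").length);
    }
}
```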



Friday, April 13, 2018

Some random notes on Java basics

Arrays and Lists

How to initialize an Array?

// Initialize by specifying the length (values default to 0)
int[] a = new int[5];
a[0] = 1;
// Initialize by specifying the values
int[] b = new int[]{1, 0, 0, 0, 0};

How to copy an Array?

int[] source = new int[]{1,2,3,4,5};
// Copy the whole array using Object.clone()
int[] a = source.clone();
// Copy into a new array of the given length (truncating or zero-padding as needed)
int[] b = Arrays.copyOf(source, source.length);
// Copy the range from a start index (inclusive) to an end index (exclusive)
int[] c = Arrays.copyOfRange(source, 0, source.length);
// Copy the given number of items from a source position to a destination position (void method)
int[] d = new int[source.length];
System.arraycopy(source, 0, d, 0, source.length);

How to initialize a List?

// Initialize with the new keyword
List<Integer> a = new ArrayList<>();
a.add(1); a.add(0); a.add(0); a.add(0); a.add(0);
// Initialize with Arrays.asList (the resulting list has a fixed size)
List<Integer> b = Arrays.asList(1, 0, 0, 0, 0);

How to copy a List?

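// shallow copy: a new list holding the same elements (Integer chosen here for illustration)
List<Integer> newList = new ArrayList<>(otherList);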
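The copy constructor also matters because a list from Arrays.asList() has a fixed size, while a copy made with new ArrayList<>(...) can grow and is independent of the original. A minimal sketch of both points; ListCopySketch, canGrow and copyIsIndependent are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListCopySketch {
    // Tries to append an element; fixed-size lists reject this.
    static boolean canGrow(List<Integer> list) {
        try {
            list.add(42);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    // Changes the copy and checks that the original list is unaffected.
    static boolean copyIsIndependent() {
        List<Integer> original = Arrays.asList(1, 2, 3);
        List<Integer> copy = new ArrayList<>(original);
        copy.set(0, 99);
        return original.get(0) == 1 && copy.get(0) == 99;
    }

    public static void main(String[] args) {
        System.out.println(canGrow(Arrays.asList(1, 2, 3)));
        System.out.println(canGrow(new ArrayList<>(Arrays.asList(1, 2, 3))));
        System.out.println(copyIsIndependent());
    }
}
```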

How to convert an Array of primitives to a List?

int[] source = new int[]{1, 0, 0, 0, 0};
// convert with for-each loop
List<Integer> a = new ArrayList<>();
for (int n : source) {
 a.add(n);
}
// convert with streams
List<Integer> b = Arrays.stream(source).boxed().collect(Collectors.toList());

How to convert an Array of non-primitives to a List?

String[] strings = new String[]{"a", "b", "c"};
List<String> l = Arrays.asList(strings);

Scanner

If you want to read console input (System.in) with Scanner in multiple methods within a single class, you should have only one instance of that scanner, because "after calling Scanner.close() on a Scanner which is associated with System.in, the System.in stream is closed and not available anymore."
Scanner s = new Scanner(System.in);
function1(s);
function2(s);
function3(s);
s.close();

Sunday, April 1, 2018

Java I/O basics - easy ways

Read from console (IDE compatible):

Scanner scanner = new Scanner(System.in);

System.out.print("Enter your nationality: ");
String nationality = scanner.nextLine();

// automatically parse primitives:
System.out.print("Enter your age: ");
int age = scanner.nextInt();
Source: CodeJava: 3 ways for reading input from the user in the console
Testing: StackOverflow: Unit testing for user input using Scanner

Write to console:

System.out.println("Some string..."); // with automatic line break
System.out.print("Some string..."); // no line break added
Source: Oracle docs
Testing: StackOverflow: JUnit test for System.out.println()

Read all lines from a file (java.nio):

List<String> lines = Files.readAllLines(Paths.get("input.txt"));
Source: Java2S.com, example in my previous post

Write all lines to a file (java.nio):

Files.write(Paths.get("output.txt"), lines);
Source: Java2S.com, example in my previous post
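Both calls can be combined into a round trip. The sketch below writes to a temporary file instead of a fixed path and passes the charset explicitly; those choices, as well as the names FileRoundTripSketch and roundTrip, are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class FileRoundTripSketch {
    // Writes the lines to a temporary file and reads them back, both as UTF-8.
    static List<String> roundTrip(List<String> lines) {
        try {
            Path file = Files.createTempFile("lines", ".txt");
            try {
                Files.write(file, lines, StandardCharsets.UTF_8);
                return Files.readAllLines(file, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file); // clean up the temporary file
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Arrays.asList("first line", "second line")));
    }
}
```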