Saturday, 19 April 2014

Fail-Safe Vs Fail-Fast iterators in Java

Fail-Fast iterators :

Fail-fast iterators fail as soon as they detect that the structure of the collection has changed since iteration began. By changing the structure of the collection, we mean adding, removing or updating an element. This is implemented by keeping a modification count: if the iterating thread detects a change in the modification count, it throws ConcurrentModificationException. Most of the collection implementations in java.util are fail-fast. However, this is not guaranteed behavior, so we should not rely on it for correctness.

We can get ConcurrentModificationException if one thread is iterating over a collection while that collection is modified by some other thread. Please note that we can get ConcurrentModificationException even in single-threaded applications. The following code snippet would throw ConcurrentModificationException :
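For example, removing an element directly from an ArrayList while iterating over it with a for-each loop (which uses the list's iterator under the hood) throws the exception; the element values here are my own:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class FailFastDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        try {
            // The for-each loop iterates via the list's fail-fast iterator.
            for (String s : list) {
                if (s.equals("a")) {
                    // Structural modification through the list itself,
                    // bypassing the iterator.
                    list.remove(s);
                }
            }
        } catch (ConcurrentModificationException e) {
            System.out.println("Caught ConcurrentModificationException");
        }
    }
}
```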
Whereas, the following code snippet won't throw ConcurrentModificationException :
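A minimal version of the safe approach, performing the same removal through the iterator itself:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class IteratorRemoveDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        // Removing through the iterator keeps the iterator's expected
        // modification count in sync with the list's count, so no
        // ConcurrentModificationException is thrown.
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().equals("a")) {
                it.remove();
            }
        }
        System.out.println(list); // prints [b, c]
    }
}
```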

So, if we use the Iterator's remove method, we won't get ConcurrentModificationException. To understand why the second approach doesn't throw it, we need to understand how fail-fast iterators are implemented in Java. They are typically implemented using a modification counter on the list object. When the list is structurally modified, the counter is incremented. When an iterator is created, the current value of the counter is copied into the iterator object. On each iterator operation, the iterator compares the list's counter with its own copy and throws ConcurrentModificationException if they differ. That explains why the first code snippet above throws ConcurrentModificationException. In contrast, the Iterator's remove method (second code snippet above) updates the iterator's copy of the counter along with the list's counter, so the check in the next iteration succeeds and no exception is thrown.

Fail-Safe iterators :

Contrary to fail-fast iterators, fail-safe iterators don't throw ConcurrentModificationException. This is because they work on a copy of the collection rather than the original collection. Fail-safe iterators typically operate over a snapshot and therefore tolerate concurrent modification. The trade-off is that a fail-safe iterator may not reflect updates made to the collection after the iterator was created. In other words, a fail-safe iterator is guaranteed to list elements as they existed upon iterator construction, and may or may not reflect modifications made after the iterator's construction (without guarantee). The iterator of CopyOnWriteArrayList and the iterators of the java.util.concurrent collections are fail-safe. Since fail-safe iterators work on copied data, they carry extra memory overhead.
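A small demonstration of the snapshot behavior with CopyOnWriteArrayList (element values are my own):

```java
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

public class FailSafeDemo {
    public static void main(String[] args) {
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        list.add("b");

        // The iterator captures a snapshot of [a, b].
        Iterator<String> it = list.iterator();

        // Modifying the list after iterator creation is allowed...
        list.add("c");

        // ...but the iterator still sees only the snapshot: a, then b.
        while (it.hasNext()) {
            System.out.println(it.next());
        }
        System.out.println(list.size()); // prints 3: the list itself has "c"
    }
}
```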

Saturday, 15 March 2014

Java 8 : Difference between List.sort and Collections.sort

One of the most anticipated programming language updates in many years is here. Java 8 is scheduled to be officially released on March 18, 2014. It offers a lot of new features which are worth exploring. The one we discuss here is List.sort. The addition of List.sort(Comparator) in Java 8 is fantastic. List.sort unsurprisingly allows you to sort a list, but the neat thing about it is that for many common list implementations (such as ArrayList), it sorts much more quickly than Collections.sort. To understand why, we first need to know how Collections.sort works internally.

Collections.sort is a stable sort, which means that equal elements won't be reordered after sorting. So quicksort and heapsort are out of the question, since they are not stable. The classic efficient stable sorting algorithm is merge sort, so Java internally uses a modified version of merge sort. The exact algorithm is derived from TimSort, which was developed by Tim Peters for Python. TimSort is a hybrid of insertion sort and merge sort: if the number of elements is below a certain threshold, insertion sort is used. This is because for small lists, the overhead of merge sort's recursive calls outweighs the performance benefit merge sort offers. Hence, it makes sense to use insertion sort for small lists.

The Collections.sort implementation first dumps the specified list into an array, sorts that array, and then copies the sorted elements back into the list. Now, why do we need to copy the list into a new array? Because a List can be backed by an array (as in ArrayList) or by a linked structure (as in LinkedList). Copying into an array is done to avoid the n^2 * log(n) performance that would result from attempting to sort a linked list in place.

But if we look carefully, this copy into a separate array is not required for ArrayList and some other list implementations, since these are already backed by an array. This is exactly what List.sort takes advantage of. The default implementation of List.sort still does what Collections.sort does, but concrete implementing classes are free to optimize. For example, ArrayList.sort invokes Arrays.sort on the ArrayList's internal array. Hence the overhead of allocating and deallocating a temporary array is gone, as is the overhead of copying the sorted elements back from that array. This is a big performance gain. Needless to say, we save space as well, since we are not allocating a new array.

Performance isn't the only potential gain from these new methods. Sorting a Collections.synchronizedList is an atomic operation using List.sort. You can iterate over all the elements of a List as an atomic operation using List.forEach. This was not possible in Java 7.
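As a quick illustration of the new API (the list contents and comparators are my own example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ListSortDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("Charlie", "alice", "Bob"));

        // List.sort delegates to the list's own implementation;
        // ArrayList sorts its internal array directly.
        names.sort(String.CASE_INSENSITIVE_ORDER);
        System.out.println(names); // prints [alice, Bob, Charlie]

        // Any Comparator works, e.g. sorting by length, longest first.
        names.sort(Comparator.comparingInt(String::length).reversed());
        System.out.println(names); // prints [Charlie, alice, Bob]
    }
}
```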

Saturday, 8 March 2014

Bonds

A bond is a fixed-income security, i.e. the return from a bond is known when it is issued. The essential difference between a bond and a stock can be summed up in one phrase : "debt vs. equity". That is, bonds represent debt and stocks represent equity ownership. This difference brings us to the first main advantage of bonds : in general, investing in bonds is safer than investing in stocks, because of the priority that debt holders have over shareholders. If a company goes bankrupt, debt holders are ahead of shareholders in the line to be paid. But not all bonds are safe; there are also very risky bonds, known as junk bonds.

Bond Issuance :

Bonds are issued by public authorities, credit institutions, governments and companies in the primary market. Corporate bonds are mostly issued through a process called underwriting, whereas government bonds are mostly issued in an auction. After the issue, the bonds can be traded in the secondary market just like other instruments.

Important Bond terms:

Principal value or Par value or face value :

This is the amount on which the issuer pays the interest. For most bonds, this amount also has to be repaid at the maturity of the bond. For some bonds, the amount paid at maturity differs from the principal amount.

Maturity :

This is the date on which the bond expires. The issuer of the bond has to pay back the nominal amount to the holder of the bond on this date. Most bonds have terms of 10 to 30 years, but there are bonds with terms of up to 100 years, as well as short-term bonds with terms under a year.

Coupon :

This is the interest rate that the issuer of the bond pays to the holder of the bond. Usually it is fixed, but it can also be variable.

Yield:

Yield is the rate of interest received from investing in the bond. It is one of the most important factors that investors care about while investing in bonds. We would talk more about it later.

Credit Rating:

This is the rating given by an agency which tells us the probability that the bond issuer would pay back the promised amount at maturity.

Some major types of bonds:

1) Fixed rate bonds : As the name suggests the coupon rate of these bonds is fixed.
2) Floating rate bonds :  The coupon rate of these bonds is not fixed.
3) Zero coupon bonds : These bonds don't pay any coupon.
4) High yield bonds :  These are also known as junk bonds. They are usually issued by start-up companies which need capital to expand but don't have a good credit rating. So, to lure investors, these companies issue high-yield bonds which have a very high coupon rate. But this comes at the cost of the higher risk associated with these bonds.
5) Convertible bonds : These bonds allow the owner of a bond to exchange it for a specific number of shares.

Bond Valuation:

Bond valuation refers to determining the present value of a bond. The fundamental principle of bond valuation is that the bond's value is equal to the present value of its expected (future) cash flows, so valuing bonds relies on time-value-of-money concepts. Let's see how to calculate the price of a bond. First we compute the present value (PV) of each of the bond's future cash flows. The PV is the amount that would have to be invested today to generate that future cash flow. The value, or price, of the bond is then the sum of the PVs of the individual cash flows.

PV at time T = (expected cash flow in period T) / (1 + i)^T
where i is the discount rate.

Value of bond = PV @ T1 + PV @ T2 + ... + PV @ Tn
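As a worked example (the numbers are hypothetical): consider a 3-year bond with a par value of 1000, a 5% annual coupon, and a 4% discount rate. A small sketch in Java:

```java
public class BondPrice {
    // Present value of a single cash flow: CF / (1 + i)^t
    static double pv(double cashFlow, double rate, int period) {
        return cashFlow / Math.pow(1.0 + rate, period);
    }

    public static void main(String[] args) {
        double face = 1000.0;  // par value (hypothetical)
        double coupon = 50.0;  // 5% annual coupon payment
        double rate = 0.04;    // discount rate
        int years = 3;

        double price = 0.0;
        for (int t = 1; t <= years; t++) {
            price += pv(coupon, rate, t);   // discount each coupon payment
        }
        price += pv(face, rate, years);     // discount the principal at maturity

        // Discount rate (4%) below coupon rate (5%), so price is above par.
        System.out.printf("Bond price: %.2f%n", price); // prints Bond price: 1027.75
    }
}
```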

How does the value of a bond change:

As we can see, the discount rate is the only variable factor in the bond price; everything else is fixed.

  • As interest rates increase or decrease, the discount rate that is used also changes. The higher the discount rate, the lower the value of the bond; the lower the discount rate, the higher the value of the bond. 
  • If the discount rate is higher than the coupon rate, the PV will be less than par. If the discount rate is lower than the coupon rate, the PV will be higher than par value.
  • As the bond moves closer to its maturity, its price moves closer to par. At par, the market price is equal to the par value and the yield to maturity is equal to the coupon rate. This is clear since the expected rate of return in this case is equal to the coupon rate. When a bond is selling at a discount, its market price is less than the par value, so its yield to maturity is greater than the coupon rate. On the other hand, if a bond is selling at a premium, its market price is greater than its par value, so its yield to maturity is less than the coupon rate.
  • If a bond's market price increases, then its yield must decrease; conversely, if a bond's market price decreases, then its yield must increase.
  • If a bond's yield does not change over its life, then the size of its discount or premium will decrease at an increasing rate as its life gets shorter.

Saturday, 22 February 2014

Mutex in Java : Explained using Producer Consumer Problem

A mutex is a kernel resource that provides synchronization services. Intrinsic locks or monitors in Java act as mutexes, which means that only one thread can own the lock at a time. When thread A attempts to acquire a lock held by thread B, thread A has to wait until thread B releases the lock.

Every Java object can intrinsically act as a lock for purposes of synchronization. These built-in locks are called intrinsic locks or monitor locks. The lock is automatically acquired by the executing thread before entering a synchronized block or method and is automatically released when control exits the synchronized block, whether by normal control flow or by an exception thrown out of the block. One important point to note is that Java's intrinsic locks are reentrant: when a thread attempts to acquire a lock held by another thread it has to wait, but if a thread attempts to acquire a lock it already holds, it succeeds.

As noted above, each object in Java can act as a lock, so we need to be careful when guarding a shared resource with a lock. For each shared resource that may be accessed by multiple threads, all accesses to that resource must be performed with the same lock held. There is no inherent relation between an object's state and its intrinsic lock: the state of an object need not be guarded by its own intrinsic lock.

Let's move to Producer Consumer problem to demonstrate the use of intrinsic locks in Java. Since intrinsic locks serve as mutexes, this example demonstrates the use of mutexes as well.

Problem:
Consider two threads : a producer thread and a consumer thread. The producer's job is to generate a piece of data and put it into a buffer shared with the consumer. The consumer's job is to consume the data from the buffer. The problem is to make sure that the producer won't try to add to the buffer if it's full, and that the consumer won't try to remove data from an empty buffer.

For demonstration purpose, we have used just a count variable instead of a buffer. The producer would produce only if the count is 0 and increment count to 1. Consumer would only consume if the count is 1 and set count to 0.

Here's the code:
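A minimal sketch of this scheme (class and method names are mine), using the count variable described above:

```java
public class ProducerConsumer {
    private int count = 0; // shared state, guarded by the intrinsic lock on 'this'

    public synchronized void produce() throws InterruptedException {
        // Wait in a loop to guard against spurious wakeups.
        while (count == 1) {
            wait(); // releases the lock while waiting
        }
        count = 1;
        System.out.println("Produced, count = " + count);
        notify(); // wake up the waiting consumer
    }

    public synchronized void consume() throws InterruptedException {
        while (count == 0) {
            wait();
        }
        count = 0;
        System.out.println("Consumed, count = " + count);
        notify(); // wake up the waiting producer
    }

    public static void main(String[] args) throws InterruptedException {
        ProducerConsumer pc = new ProducerConsumer();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) pc.produce();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) pc.consume();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}
```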


Here are few important points that we must note :
  1. Note that wait() and notify() are called from a synchronized block. This is required to avoid a race condition: we may lose a notify signal if these methods are not called while holding the lock.
  2. wait() and notify() methods are defined in the Object class. This is because every Java object has a monitor associated with it.
  3. Why are we waiting in a loop in both produce() and consume() methods ? Because of spurious wakeups: it is possible for threads to wake up even if notify() or notifyAll() has not been called. The while loop also handles the case where multiple waiting threads are awakened by notifyAll() but only one of them gets the lock; the others should check the condition and continue waiting.
  4. The wait() method releases the acquired lock/monitor. Without this, the above code could easily deadlock.

Saturday, 18 January 2014

Java result bearing tasks : Callable and Future

Whenever we think of creating threads in Java, the Runnable interface comes to mind. If we implement Runnable, we need to implement the run method, which takes no arguments and has no return type. So if we expect a value to be returned after a task completes, we need to look for something else. Also, Runnable's run method can't throw a checked exception.

That something else is Callable. Callable is an interface introduced in Java 5 to meet this requirement. Callable is similar to Runnable, but it returns a value. The call() method is the entry point into a Callable object, and its return type is the type parameter of the Callable. Please note that call() can throw a checked exception, unlike run() in Runnable, which cannot.

A Future represents the result of an asynchronous computation. The result can be retrieved using the get() method once the computation has completed; a call to get() blocks until the result is available. Runnable and Callable represent abstract computational tasks, while Future represents the life cycle of a task: it provides methods to test whether the task has completed or been cancelled, to retrieve its result, and to cancel it. If you want to use Future for the sake of cancellability but do not need a usable result, you can declare types of the form Future<?> and return null as the result of the underlying task. FutureTask provides a base implementation of Future. You can refer to the online Java documentation to learn more about Callable, Future and FutureTask.

I have demonstrated the usage of Callable and Future through the following programs :

RandomStringGenerator class :

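A possible implementation along these lines (the exact fields, alphabet and length handling are my assumptions):

```java
import java.util.Random;
import java.util.concurrent.Callable;

// A Callable task that produces a value, unlike Runnable.
public class RandomStringGenerator implements Callable<String> {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    private final int length;
    private final Random random = new Random();

    public RandomStringGenerator(int length) {
        this.length = length;
    }

    // call() is the entry point; its return type matches the
    // type parameter (String) of the Callable.
    @Override
    public String call() {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }
}
```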

RandomStringGeneratorTest class :

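A self-contained sketch of the test driver; the Callable is inlined here as a lambda standing in for a RandomStringGenerator instance, so the snippet compiles on its own:

```java
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RandomStringGeneratorTest {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // Stand-in for new RandomStringGenerator(10): a Callable<String>
        // that builds a 10-character random lowercase string.
        Random random = new Random();
        Callable<String> generator = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10; i++) {
                sb.append((char) ('a' + random.nextInt(26)));
            }
            return sb.toString();
        };

        // submit() returns a Future representing the pending result.
        Future<String> future = executor.submit(generator);

        // get() blocks until call() has completed on the worker thread.
        String result = future.get();
        System.out.println("Generated: " + result + " (length " + result.length() + ")");

        executor.shutdown();
    }
}
```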


Thursday, 2 January 2014

LazySet Operations in Java

lazySet is a method offered by the Atomic classes in Java. The lazySet method offers significantly cheaper volatile-style writes for single writers. Let's first find out how lazySet achieves this performance gain; for that, we need to go into the Java Memory Model (JMM). Suppose a primitive variable is shared among different CPU cores. Each core has a local cache where it can keep the variable for faster access, so that it doesn't have to go to main memory every time it accesses the variable. If one thread makes a change to this variable, the change may not be visible on another core, since that core has its own cache. To mitigate this problem, the Java memory model offers the volatile keyword: reads and writes of a volatile variable behave as if they bypass the cache and go to main memory. (Note that the cache is not really bypassed; memory barriers are used to enforce visibility.) This comes at a latency cost, and lazySet can be used to mitigate that cost.

The Java Atomic variables have get() and set() methods that act as volatile reads and writes, with the additional feature of a cheaper, "semi-volatile" write. When you change the value of an Atomic variable through the lazySet method, the value is updated locally but is not immediately forced out to other threads; it can take an indefinite (though typically very short) amount of time for the change to become visible to other threads. Since lazySet does not force the value out to main memory on every write, writes are significantly faster. The most important use case of the lazySet operation is to implement asynchronous logging.
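A small sketch contrasting the two kinds of write on AtomicLong (the variable name is my own):

```java
import java.util.concurrent.atomic.AtomicLong;

public class LazySetDemo {
    // A sequence number published by a single writer thread.
    private static final AtomicLong sequence = new AtomicLong(0);

    public static void main(String[] args) {
        // set() is a full volatile write: the new value is
        // immediately visible to other threads.
        sequence.set(1);

        // lazySet() is a cheaper ordered write: earlier writes cannot be
        // reordered past it, but the value may become visible to other
        // threads slightly later.
        sequence.lazySet(2);

        // Within the writing thread itself, the value is always current.
        System.out.println(sequence.get()); // prints 2
    }
}
```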

Sunday, 1 December 2013

Hyper threading and CPU core binding

When we hit the limits of performance tuning in software, we go back to the hardware. With a fixed set of machines and cores, the remaining things we can exploit are hyper-threading, and binding a specific process to a particular core while not allowing any other process to execute on that core. We often encounter this scenario in the low-latency trading domain, where even a few microseconds of latency may be unacceptable.

The video in the following link explains hyper threading in very simple terms.

Here is a nice article to get a sense of how to bind a process to a CPU core :