[Source Code] ArrayList Source Code Analysis


//--------------------------------------------------------------------

Please credit the source when reposting: http://blog.csdn.net/chdjj

by Rowandjj

2014/8/7

//--------------------------------------------------------------------

Starting with this post, I will analyze some of the more important and commonly used classes in the Java collections framework. This post covers ArrayList.

As in my earlier source analysis of StringBuilder and StringBuffer, we start with the big picture of the Java collections framework. Below is a diagram of it.

[Figure: class hierarchy of the Java collections framework]

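As the diagram shows, ArrayList sits at the very bottom of the inheritance tree, i.e. it is a leaf node. To understand how ArrayList is implemented, we have to study its supertypes first. So let's start from Collection and work our way down.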

public interface Collection<E> extends Iterable<E> {
    // Query Operations
    int size();
    boolean isEmpty();
    boolean contains(Object o);
    Iterator<E> iterator();
    Object[] toArray();
    <T> T[] toArray(T[] a);
    // Modification Operations
    boolean add(E e);
    boolean remove(Object o);
    // Bulk Operations
    boolean containsAll(Collection<?> c);
    boolean addAll(Collection<? extends E> c);
    boolean removeAll(Collection<?> c);
    boolean retainAll(Collection<?> c);
    void clear();
    // Comparison and hashing
    boolean equals(Object o);
    int hashCode();
}

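Collection mainly specifies the contract for query, modification, bulk, and comparison operations. Since it is an interface, it contains no implementation.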

This interface also extends Iterable, which means "can be iterated over". Iterable's job is to return an iterator object:

package java.lang;
import java.util.Iterator;
public interface Iterable<T> {
    Iterator<T> iterator();
}

The Iterator object in turn defines the contract for traversal and removal:

public interface Iterator<E> {
    boolean hasNext();
    E next();
    void remove();
}

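As a result, every Collection implementation can hand out an iterator tied to itself through the iterator method, and callers traverse it using the rules Iterator defines (how the traversal actually works is up to the implementing class). This gives every Collection a consistent external interface and makes the framework easier to learn.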
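
To make the point about a uniform interface concrete, here is a minimal sketch. The class name IteratorDemo and the helper removeShort are my own names for illustration, not part of the JDK. The helper touches the collection only through Iterator, so the same code works for an ArrayList and a HashSet alike.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;

class IteratorDemo {
    // Removes every string shorter than minLength and returns how many were removed.
    // Only the Iterator contract is used, so any Collection implementation
    // whose iterator supports remove() will work.
    static int removeShort(Collection<String> c, int minLength) {
        int removed = 0;
        Iterator<String> it = c.iterator();
        while (it.hasNext()) {
            String s = it.next();
            if (s.length() < minLength) {
                it.remove(); // removes the element last returned by next()
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Collection<String> list = new ArrayList<String>();
        list.add("a");
        list.add("bbb");
        list.add("cc");
        Collection<String> set = new HashSet<String>(list);

        System.out.println(removeShort(list, 2) + " removed, left: " + list);
        System.out.println(removeShort(set, 2) + " removed, left: " + set.size() + " elements");
    }
}
```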

With the top-level interface understood, let's move on to the List interface and the AbstractCollection abstract class below it.

First, the List interface:

public interface List<E> extends Collection<E> {
    // Query Operations
    int size();
    boolean isEmpty();
    boolean contains(Object o);
    Iterator<E> iterator();
    Object[] toArray();
    <T> T[] toArray(T[] a);

    // Modification Operations
    boolean add(E e);
    boolean remove(Object o);
    // Bulk Modification Operations
    boolean containsAll(Collection<?> c);
    boolean addAll(Collection<? extends E> c);
    boolean addAll(int index, Collection<? extends E> c);
    boolean removeAll(Collection<?> c);
    boolean retainAll(Collection<?> c);
    void clear();
    // Comparison and hashing
    boolean equals(Object o);
    int hashCode();
    // Positional Access Operations
    E get(int index);
    E set(int index, E element);
    void add(int index, E element);
    E remove(int index);
    // Search Operations
    int indexOf(Object o);
    int lastIndexOf(Object o);
    // List Iterators 
    ListIterator<E> listIterator();   
    ListIterator<E> listIterator(int index);
    // View
    List<E> subList(int fromIndex, int toIndex);
}

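As you can see, List adds methods for looking up, removing, and traversing by index. This is because a List is ordered: its common implementations are backed by either an array or a linked list, and both can address elements by position, hence methods like get(index). A Set has no notion of position, so it does not need these methods.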

This interface also introduces a new iterator, ListIterator. ListIterator extends Iterator and adds backward traversal, element insertion, and more:

public interface ListIterator<E> extends Iterator<E> {
    // Query Operations
    boolean hasNext();
    E next();
    boolean hasPrevious();
    E previous();
    int nextIndex();
    int previousIndex();
    // Modification Operations
    void remove();
    void set(E e);
    void add(E e);
}
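
Here is a small sketch of what the extra ListIterator methods buy you. ListIteratorDemo, reversed, and duplicateEach are names I made up for this example; only the ListIterator calls themselves come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    // Returns a new list holding the elements of src in reverse order,
    // walking backwards with hasPrevious()/previous().
    static <T> List<T> reversed(List<T> src) {
        List<T> out = new ArrayList<T>(src.size());
        ListIterator<T> it = src.listIterator(src.size()); // cursor placed after the last element
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    // Inserts a copy of each element right after it, in place.
    static <T> void duplicateEach(List<T> list) {
        ListIterator<T> it = list.listIterator();
        while (it.hasNext()) {
            T e = it.next();
            it.add(e); // inserted before the cursor, so next() will not return it again
        }
    }

    public static void main(String[] args) {
        List<Integer> nums = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        System.out.println(reversed(nums));
        duplicateEach(nums);
        System.out.println(nums);
    }
}
```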

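Next up is the AbstractCollection abstract class. Note that it is an abstract class. Remember the StringBuilder and StringBuffer posts? Both of those extend an abstract class called AbstractStringBuilder. Judging by the names, the JDK is quite consistent here.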

It has many methods, so I'll pick out a few important ones. First, which methods of this abstract class are abstract:

 public abstract Iterator<E> iterator();
 public abstract int size();

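Just two abstract methods, which makes sense: the iteration rules have to be implemented per data structure, and size depends on the concrete data structure too.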

Now look at the contains method:

  public boolean contains(Object o) {
        Iterator<E> it = iterator();
        if (o==null) {
            while (it.hasNext())
                if (it.next()==null)
                    return true;
        } else {
            while (it.hasNext())
                if (o.equals(it.next()))
                    return true;
        }
        return false;
    }

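See that? The method runs one of two loops depending on whether the argument is null: a null argument is matched with ==, anything else with equals. Methods such as remove follow the same pattern. This tells us that collections extending AbstractCollection are allowed to contain null elements!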

Another interesting one is the add method:

public boolean add(E e) {
        throw new UnsupportedOperationException();
    }

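It always throws. This is because AbstractCollection has no idea how to add an element; any concrete class that supports adding has to override this method.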
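
To see how much AbstractCollection gives you for free, here is a minimal sketch of a read-only collection. ArrayBag and AbstractCollectionDemo are hypothetical classes written for this post, not JDK classes. Only iterator() and size() are implemented; contains is inherited and handles null, while the inherited add throws.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.Iterator;
import java.util.NoSuchElementException;

class AbstractCollectionDemo {
    // Returns true if the collection's add() rejects the element.
    static boolean addIsUnsupported(Collection<String> c) {
        try {
            c.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ArrayBag<String> bag = new ArrayBag<String>(new String[] {"a", null, "b"});
        System.out.println(bag.contains("b"));     // inherited contains(), equals branch
        System.out.println(bag.contains(null));    // inherited contains(), null branch
        System.out.println(addIsUnsupported(bag)); // inherited add() always throws
    }
}

// Hypothetical read-only collection backed by a fixed array.
class ArrayBag<E> extends AbstractCollection<E> {
    private final E[] items;

    ArrayBag(E[] items) {
        this.items = items.clone(); // defensive copy
    }

    @Override
    public int size() {
        return items.length;
    }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private int cursor = 0;

            @Override
            public boolean hasNext() {
                return cursor < items.length;
            }

            @Override
            public E next() {
                if (!hasNext()) throw new NoSuchElementException();
                return items[cursor++];
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```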

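At this point we can see that the Java designers built the collections framework in layers. Each layer adds its own methods and implements whatever it is able to implement, which improves code reuse.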

Next we will look at AbstractList and then ArrayList. AbstractList is an abstract class that extends AbstractCollection, and ArrayList directly extends AbstractList.

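What sets AbstractList apart from AbstractCollection is that it implements the List interface, so it provides operations that ordered collections such as ArrayList and LinkedList rely on, for example ListIterator support.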

AbstractList contains two inner classes, Itr and ListItr, which implement Iterator and ListIterator respectively.

Start with Itr, which has three fields:

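int cursor = 0; // cursor: index of the element the next call to next() will return
int lastRet = -1; // index of the element most recently returned by next(); -1 if there is none
int expectedModCount = modCount; // the modification count the iterator expects the list to have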

Now the implementation:

    public boolean hasNext() {
        return cursor != size(); // true as long as the cursor has not reached the end
    }
    public E next() {
        checkForComodification(); // check whether the list was modified behind the iterator's back
        try {
            int i = cursor; // position for this call
            E next = get(i); // fetch the element at that position
            lastRet = i; // remember the position just returned
            cursor = i + 1; // advance the cursor to the next position
            return next;
        } catch (IndexOutOfBoundsException e) {
            checkForComodification();
            throw new NoSuchElementException();
        }
    }
    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();
        try {
            AbstractList.this.remove(lastRet);
            if (lastRet < cursor)
                cursor--;
            lastRet = -1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException e) {
            throw new ConcurrentModificationException();
        }
    }
    final void checkForComodification() {
        // if the actual modification count differs from the expected one, throw
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }

ListItr follows essentially the same logic, so I won't repeat it here.
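
The modCount/expectedModCount check is what makes these iterators fail-fast. Here is a small sketch of both sides of that behavior; FailFastDemo and its two methods are names I chose for the example. Changing the list directly during iteration trips the check, while removing through the iterator keeps the two counters in sync.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Modifies the list directly while an iterator is open and reports
    // whether the following next() call detects it.
    static boolean modificationDetected() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        Iterator<Integer> it = list.iterator();
        it.next();
        list.add(4); // modCount changes, the iterator's expectedModCount does not
        try {
            it.next(); // checkForComodification() runs first
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes even numbers through the iterator, which resynchronizes
    // expectedModCount after each removal.
    static List<Integer> removeEven(List<Integer> list) {
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(modificationDetected());
        System.out.println(removeEven(new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4))));
    }
}
```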

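We have spent a lot of space on ArrayList's superclasses and superinterfaces so that you have an overall picture of the collections framework. Now let's get into the ArrayList source itself.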

-----------------------------------------------------------------------

First the fields:

private static final long serialVersionUID = 8683452581122892189L;
private transient Object[] elementData; // the array that stores the ArrayList's elements; it can grow
private int size; // current size of the ArrayList (number of elements actually stored)

Then the constructors:

    public ArrayList(int initialCapacity) { // the argument is the initial capacity
        super();
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        this.elementData = new Object[initialCapacity];
    }
    public ArrayList() { // the default capacity is 10
        this(10);
    }
    public ArrayList(Collection<? extends E> c) { // build from another collection
        elementData = c.toArray();
        size = elementData.length;
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    }

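As with StringBuilder/StringBuffer, a constructor lets you set the initial capacity of the collection (its capacity, not its size). In the version of the source shown here, the default capacity of an ArrayList is 10.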
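
A quick sketch of the three constructors in use. ConstructorDemo, squares, and rejectsNegativeCapacity are hypothetical names for this example. Note that the capacity argument only sizes the backing array; the list still starts out empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConstructorDemo {
    // Builds the first n squares. The backing array is sized up front,
    // so filling the list should not need to grow it.
    static List<Integer> squares(int n) {
        List<Integer> out = new ArrayList<Integer>(n); // capacity n, size still 0
        for (int i = 0; i < n; i++) {
            out.add(i * i);
        }
        return out;
    }

    // The capacity constructor rejects negative values.
    static boolean rejectsNegativeCapacity() {
        try {
            new ArrayList<String>(-1);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new ArrayList<String>(100).size()); // capacity is not size
        System.out.println(squares(5));
        System.out.println(rejectsNegativeCapacity());
        // Copy constructor: the new list gets its own backing array.
        List<String> copy = new ArrayList<String>(Arrays.asList("x", "y"));
        System.out.println(copy);
    }
}
```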

Now the methods related to growing. ensureCapacity is the entry point for growing the array and is very important:

public void ensureCapacity(int minCapacity) {
        if (minCapacity > 0)
            ensureCapacityInternal(minCapacity);
    }
    private void ensureCapacityInternal(int minCapacity) {
        modCount++;
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

If the required minimum capacity is greater than the current array length, grow is called to enlarge the array:

private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1); // grow to about 1.5x the old capacity
        if (newCapacity - minCapacity < 0) // still smaller than the required minimum?
            newCapacity = minCapacity; // then just use minCapacity
        if (newCapacity - MAX_ARRAY_SIZE > 0) // check whether it exceeds the maximum array size
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

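Each time it grows, the list first tries 1.5 times the old capacity (rounded down, since the code uses a right shift) and compares that with the required minimum. If the result is still smaller than the minimum, the minimum is used directly.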
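
To get a feel for the numbers, here is a sketch that replays only the arithmetic of grow. GrowthDemo and nextCapacity are my own names, and the sketch deliberately leaves out the MAX_ARRAY_SIZE branch, so it is only meaningful for small capacities.

```java
class GrowthDemo {
    // Same arithmetic as grow(), without the MAX_ARRAY_SIZE handling.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // about 1.5x, rounded down
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity; // 1.5x was not enough, take the minimum
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        // Simulate add() being called on a full array over and over:
        // each time, the minimum required capacity is the old capacity + 1.
        int capacity = 10;
        for (int i = 0; i < 6; i++) {
            System.out.print(capacity + " ");
            capacity = nextCapacity(capacity, capacity + 1);
        }
        System.out.println();
    }
}
```

Starting from the default capacity of 10, this prints 10 15 22 33 49 73. The second branch matters for tiny or bulk requests: nextCapacity(1, 2) gives 2, and nextCapacity(10, 100) gives 100.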

Now a few commonly used methods. First add: the superclasses never gave add a real implementation, and ArrayList finally does:

 public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e; // store at index size, then increment size
        return true;
    }

The clear method:

 public void clear() {
        modCount++; // one more structural modification
        // Let gc do its work
        for (int i = 0; i < size; i++)
            elementData[i] = null; // null out each slot
        size = 0; // reset size to 0
    }

The remove method:

   public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true;
                }
        } else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true;
                }
        }
        return false;
    }
   private void fastRemove(int index) {
        modCount++;
        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // Let gc do its work
    }

The method handles two cases, a null argument and a non-null argument, because ArrayList allows null elements.

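Also note System.arraycopy. It is a native method, so the copying is done by native code inside the JVM rather than by a Java loop, which makes it a very efficient way to copy arrays.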
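
Here is the same shift-left trick that fastRemove uses, pulled out onto a plain array so it can be run on its own. ArrayCopyDemo and removeAt are hypothetical names; the only JDK call involved is System.arraycopy, which handles the overlapping source and destination ranges correctly.

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // Removes the element at index from the first `size` slots of data
    // by shifting the tail one position to the left. Returns the new size.
    static int removeAt(Object[] data, int size, int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index + ", size: " + size);
        }
        int numMoved = size - index - 1; // elements to the right of index
        if (numMoved > 0) {
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[--size] = null; // clear the now-unused last slot
        return size;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", null};
        int size = removeAt(data, 4, 1);
        System.out.println(size + " " + Arrays.toString(data));
    }
}
```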

ArrayList also has its own Itr and ListItr inner classes, which are basically the same as the ones in AbstractList.

Finally, pay attention to the two overloads of toArray. Take care with the one that returns Object[]: you cannot cast the returned array as a whole:

List<String> list = new ArrayList<String>();
list.add("zhangsan");
list.add("lisi");
list.add("wangwu");
// Casting the whole returned array down to String[] throws a ClassCastException
String[] strs = (String[]) list.toArray(); // wrong!!!
for(int i = 0; i < strs.length; i++)
{
	System.out.println(strs[i]);
}

The code above throws an exception. The correct way is:

List<String> list = new ArrayList<String>();
list.add("zhangsan");
list.add("lisi");
list.add("wangwu");
Object[] strs = list.toArray(); // keep the array as Object[]
for(int i = 0; i < strs.length; i++)
{
	System.out.println((String)strs[i]); // cast each element instead
}

Of course, the other overload doesn't have this problem:

List<String> list = new ArrayList<String>();
list.add("zhangsan");
list.add("lisi");
list.add("wangwu");
String[] strs = list.toArray(new String[list.size()]);
for(int i = 0; i < strs.length; i++)
{
	System.out.println(strs[i]);
}

Finally, a summary:

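1. ArrayList is backed internally by an Object array; when the array fills up, it grows as needed.

2. It is best to estimate the size of an ArrayList and set its initial capacity accordingly, to avoid the performance cost of unnecessary growing.

3. The default initial capacity of an ArrayList is 10 (in the version of the source analyzed here).

4. Each time it grows, an ArrayList increases its capacity to 1.5 times the old value; if that is still less than the required minimum, it allocates exactly the required capacity.

5. ArrayList allows null elements.

6. ArrayList has two inner classes that implement Iterator and ListIterator and define its iteration rules.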
