okhttp3源码解析(9)-Okio

前言

上两篇文章写了下okhttp3的缓存功用,里边用到了许多okio的内容,在网络请求的部分也用到了许多,我觉得仍是有必要研讨下的。一开端看okhttp3源码的时分,我也不知道怎样下手,后边发现不如直接就从最简略的运用开端,看到什么研讨什么。下面的okio我也计划这么做。

其实我再写文章前找了些材料看了下,现已有作者写了很好的文章了,读者能够先看下他人的,图文并茂。

Android IO 结构 Okio 的完成原理,究竟哪里 OK? Android IO 结构 Okio 的完成原理,怎么检测超时? 值得一用的IO神器Okio

我这就只能说是自己看源码的了解了,比较僵硬,比较无聊,仅供参考。

ps. 这篇文章时间跨度有点长了,中心慢慢写得,有点当地好啰嗦,有的当地又有点急了,写得也特别特别长,原本应该分好几篇文章的,建议按目录按需看。

Okio运用事例

和okhttp3一开端研讨相同,咱们从Okio的用法开端,我从okhttp的源码里边找了一些比如,能够先看下:

  // DiskLruCache
  synchronized void rebuildJournal() throws IOException {
    // ...
    BufferedSink writer = Okio.buffer(fileSystem.sink(journalFileTmp));
    try {
      writer.writeUtf8(MAGIC).writeByte('\n');
      writer.writeUtf8(VERSION_1).writeByte('\n');
      writer.writeDecimalLong(appVersion).writeByte('\n');
      writer.writeDecimalLong(valueCount).writeByte('\n');
      writer.writeByte('\n');
      for (Entry entry : lruEntries.values()) {
        if (entry.currentEditor != null) {
          writer.writeUtf8(DIRTY).writeByte(' ');
          writer.writeUtf8(entry.key);
          writer.writeByte('\n');
        } else {
          writer.writeUtf8(CLEAN).writeByte(' ');
          writer.writeUtf8(entry.key);
          entry.writeLengths(writer);
          writer.writeByte('\n');
        }
      }
    } finally {
      writer.close();
    }
    // ...
  }
  // Http1Codec
  private class UnknownLengthSource extends AbstractSource {
    private boolean inputExhausted;
    // ...
    @Override public long read(Buffer sink, long byteCount)
        throws IOException {
      if (byteCount < 0) throw new IllegalArgumentException("byteCount < 0: " + byteCount);
      if (closed) throw new IllegalStateException("closed");
      if (inputExhausted) return -1;
      long read = super.read(sink, byteCount);
      if (read == -1) {
        inputExhausted = true;
        endOfInput(true, null);
        return -1;
      }
      return read;
    }
    // ...
  }

找了两个比如,一个读一个写,别离来自DiskLruCache和Http1Codec,一个是对IO文件的处理,一个是对socket流的处理,okio首要也是用在这两部分。这儿的比如看起来还不是很明显,咱们再来看下okio简略的运用:

    // OKio写文件 
    private static void writeFileByOKio() {
        try (Sink sink = Okio.sink(new File(path));
             BufferedSink bufferedSink = Okio.buffer(sink)) {
            bufferedSink.writeUtf8("write" + "\n" + "success!");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    //OKio读文件
    private static void readFileByOKio() {
        try (Source source = Okio.source(new File(path));
             BufferedSource bufferedSource = Okio.buffer(source)) {
            for (String line; (line = bufferedSource.readUtf8Line()) != null; ) {
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

关于Java的IO来说,这儿看起来仍是比较简洁的,Java要读取个文件要一层一层的封装Stream,相似下面代码:

fileStream = new FileInputStream(path);
binStream = new BufferedInputStream(fileStream);
dataInputStream = new DataInputStream(binStream);
dataInputStream.readInt();

实践上便是BufferedSource和BufferedSink两个,把大部分活都干了,而Java的要将BufferedInputStream/BufferedOutputStream转成专门处理的类去干活,各有各的优点吧,可是关于咱们搬砖的来说okio必定用起来简略,这点就够了。

Okio类剖析

从上面比如能够看出,Okio类便是okio的一个入口,用来供给source和sink,能够看下它的源码,全是static的办法,大致归类下就三种:

  1. 生成sink
  2. 生成source
  3. 生成buffer

生成sink和source有几种办法: file、path、socket、inputStream\outputStream,其实终究都是通过stream的办法创建的:

private static Sink sink(final OutputStream out, final Timeout timeout) {
    // ...
    return new Sink() {
      @Override public void write(Buffer source, long byteCount) throws IOException {
        // 检查巨细参数是否正确
        checkOffsetAndCount(source.size, 0, byteCount);
        while (byteCount > 0) {
          // 同步超时检测
          timeout.throwIfReached();
          // segment是缓存的最小单位,运用双向链表保存,能够同享
          Segment head = source.head;
          // head.limit按注释解释,是其时segment下一个可读取字节的方位->所以相减得到长度
          int toCopy = (int) Math.min(byteCount, head.limit - head.pos);
          out.write(head.data, head.pos, toCopy);
          // 这个head都会pop了还有必要加上读取字节数吗?也或许没有pop啊!得加上
          head.pos += toCopy;
          byteCount -= toCopy;
          source.size -= toCopy;
          // 意思便是head这个segment读取完了吧
          if (head.pos == head.limit) {
            source.head = head.pop();
            // SegmentPool会暂时贮存读取完的Segment,当然呗同享出去的不会进入SegmentPool
            SegmentPool.recycle(head);
          }
        }
      }
      // 暂时忽略其他办法
    };
  }
private static Source source(final InputStream in, final Timeout timeout) {
    // ...
    return new Source() {
      @Override public long read(Buffer sink, long byteCount) throws IOException {
        if (byteCount < 0) throw new IllegalArgumentException("byteCount < 0: " + byteCount);
        if (byteCount == 0) return 0;
        try {
          timeout.throwIfReached();
          // 没有则从SegmentPool拿,又就通过双向链表移到终究(head的pre便是终究)
          Segment tail = sink.writableSegment(1);
          // Segment缓存种剩余能写入的最大空间,tail.limit下一个能写入的方位
          int maxToCopy = (int) Math.min(byteCount, Segment.SIZE - tail.limit);
          // 把数据读取到tail里边,tail.limit是offset
          int bytesRead = in.read(tail.data, tail.limit, maxToCopy);
          if (bytesRead == -1) return -1;
          // 这个是Segment下一个能写入的方位
          tail.limit += bytesRead;
          // 这个是sink的总巨细
          sink.size += bytesRead;
          return bytesRead;
        } catch (AssertionError e) {
          if (isAndroidGetsocknameError(e)) throw new IOException(e);
          throw e;
        }
      }
      // 暂时忽略其他办法
    };
  }

能够看到sink和source办法都是自己创建了一个实例进行回来,首要重写了其间的read和write办法,这两个办法里边呈现了许多许多东西,这些呈现的东西便是咱们接下来要研讨的,首要有下面几个内容:

  • Buffer,缓存
  • timeout,超时检测
  • Segment,分段缓存
  • SegmentPool,缓存池

注释解释了一些,可是仍是得体系的研讨下,下面咱们开端研讨。

Okio承继结构

在解析Okio的详细内容前,我觉得仍是有必要先看下它的承继结构,这儿拿他人的图看下,如有侵权请联络啊!

这是Java IO的结构:

okhttp3源码解析(9)-Okio

这是Okio的结构:

okhttp3源码解析(9)-Okio

这儿掉了一个Buffer类的,Buffer承继了BufferedSource和BufferedSink接口,也便是说Buffer既是sink又是source,当然它也有好多自己的功用。

根据上面的图,首要功用就在BufferedSource和BufferedSink里边,它们的详细完成则是RealBufferedSource和RealBufferedSink,关于Segment和SegmentPool的内容上面也没有,下面独立讲解。

下面内容想了挺久,怎样去安排这样一篇博文,怎样更顺畅去看源码,终究我决定由小及大的办法来讲,先从最小的segment开端,到BufferedSource和BufferedSink,终究到buffer,下面看内容。

Segment

Segment大致便是一个数据类,是双向链表的一个节点,供给了一些操作segment的的办法,下面咱们先看数据域:

  /** The size of all segments in bytes. */
  static final int SIZE = 8192;
  /** Segments will be shared when doing so avoids {@code arraycopy()} of this many bytes. */
  static final int SHARE_MINIMUM = 1024;
  final byte[] data;
  /** The next byte of application data byte to read in this segment. */
  int pos;
  /** The first byte of available data ready to be written to. */
  int limit;
  /** True if other segments or byte strings use the same byte array. */
  boolean shared;
  /** True if this segment owns the byte array and can append to it, extending {@code limit}. */
  boolean owner;
  /** Next segment in a linked or circularly-linked list. */
  Segment next;
  /** Previous segment in a circularly-linked list. */
  Segment prev;

其实注释写的十分清楚了,SIZE是segment的最大容量,SHARE_MINIMUM是同享的数组最短长度(即一个segment能被分红八份),data表明数据,pos和limit用来符号数据部分区间,shared符号是否被同享,owner表明这个segment是否拥有这个data数组(拥有才能够写、追加,被同享的不能写),pre和next是双向指针。

数据域看完了,大致便是一个数据类,下面看下功用。

同享

Segment供给了同享的操作,有两种,一种是同享data数组的同享,一种是深度拷贝,同享出去的对象彻底是一个克隆体:

  final Segment sharedCopy() {
    shared = true;
    return new Segment(data, pos, limit, true, false);
  }
  final Segment unsharedCopy() {
    return new Segment(data.clone(), pos, limit, false, true);
  }

看到这儿,应该就能了解shared和owner两个标志了吧,榜首个函数总data是数组,不会进行赋值,而是会被新的segment持有引证,深度拷贝同享的shared为false、owner为true,就表明它是一个全新的segment仅仅数据内容和仿制的相同。

segment操作办法

剩余的办法便是对segment的操作了,有点相似stack,这儿完成了pop和push办法来对链表操作,留意下这些操作是头插法。

spilt办法很有意思,咱们这儿看下,也有利于了解pop和push:

public final Segment split(int byteCount) {
    if (byteCount <= 0 || byteCount > limit - pos) throw new IllegalArgumentException();
    Segment prefix;
    // 分了两种状况,一个是直接同享了,一个是创建个新的segment
    if (byteCount >= SHARE_MINIMUM) {
      prefix = sharedCopy();
    } else {
      prefix = SegmentPool.take();
      // 将pos开端的前byteCount个数据仿制到新segment
      System.arraycopy(data, pos, prefix.data, 0, byteCount);
    }
    // 确认新segment地尾端
    prefix.limit = prefix.pos + byteCount;
    // 把其时segment越过被split的byteCount个字节,即按pos读的时分越过一部分
    pos += byteCount;
    // 头插法,新的segment放入链表头部
    prev.push(prefix);
    return prefix;
  }

SegmentPool

都提到Segment了,SegmentPool当然得提下了,这个类很短,就六十多行代码,它唯一的意图便是把用不到的segment运用起来,避免重复创建对象带来的损耗,类是线程安全的。

A collection of unused segments, necessary to avoid GC churn and zero-fill. This pool is a thread-safe static singleton.

SegmentPool就两个办法: take和recycle,三个变量,MAX_SIZE界说了SegmentPool的最大容量(64 * 1024 = 8 * 8192,即默许八个segment),next变量指向了下一个能够运用的segment,byteCount变量记载了这个SegmentPool实践运用的字节数。

take办法

咱们先来看下take办法,用来取的一个segment以供运用:

  static Segment take() {
    synchronized (SegmentPool.class) {
      if (next != null) {
        Segment result = next;
        next = result.next;
        result.next = null;
        byteCount -= Segment.SIZE;
        return result;
      }
    }
    return new Segment(); // Pool is empty. Don't zero-fill while holding a lock.
  }

这儿便是链表操作,把next取出来,next的next作为新的next保存到SegmentPool中。假如next为空,就创建一个新的Segment,这儿写在同步代码外,还加了句注释,我看得不是很懂,意思是在next为空的时分,recycle的同步代码不要往里边push值吗?

recycle办法

  static void recycle(Segment segment) {
    // 不能乱了SegmentPool的链表
    if (segment.next != null || segment.prev != null) throw new IllegalArgumentException();
    // segment被同享,其间的data还在运用,不能复用,也不会被GC收回,所以不必管
    if (segment.shared) return; // This segment cannot be recycled.
    synchronized (SegmentPool.class) {
      if (byteCount + Segment.SIZE > MAX_SIZE) return; // Pool is full.
      // 这儿按segment最大数量算,所以SegmentPool中的segment个数是整数
      byteCount += Segment.SIZE;
      segment.next = next;
      segment.pos = segment.limit = 0;
      next = segment;
    }
  }

recycle办法也很短,首要便是做了一些判别,就把新放进来的segment放到了链表头上。

BufferedSource和BufferedSink

上面提到了Buffer类承继了BufferedSource和BufferedSink,所以在讲Buffer之前,咱们先来看下这两个办法。从上面的承继结构咱们知道了,BufferedSource和BufferedSink承继自Sink和Source,但其功用由RealBufferedSource和RealBufferedSink完成。下面咱们一层一层看。

Sink和Source接口

Sink和Source便是两个最简略的接口,下面看下源码:

public interface Sink extends Closeable, Flushable {
  void write(Buffer source, long byteCount) throws IOException;
  @Override void flush() throws IOException;
  Timeout timeout();
  @Override void close() throws IOException;
}
public interface Source extends Closeable {
  long read(Buffer sink, long byteCount) throws IOException;
  Timeout timeout();
  @Override void close() throws IOException;
}

删掉了注释,实践上Sink和Source便是读写加上了timeout的超时操控和Closeable的符号,Sink由于要输出加上了Flushable接口。

BufferedSource和BufferedSink接口

BufferedSource和BufferedSink都是抽象接口,仅仅界说了一些办法,两者大体相似,也有不同,下面先来看BufferedSink。

BufferedSink接口

BufferedSink办法特别多,咱们这归类下,首要有四种:

  1. Sink接口的flush办法
  2. getter办法: buffer、outputStream
  3. 一系列的writeXXX办法,取得BufferedSink
  4. 两个emit办法,也回来BufferedSink

flush办法、buffer办法、outputStream办法都比较好了解,BufferedSink接口的首要功用就在这些writeXXX办法上,下面大致搜集下write能够写什么:

  • ByteString、byte[]、Source、String(Utf8)、codePoint(UTF-8)、
  • String(Charset)、Byte、Short(Big-Endian)、ShortLe(Little-Endian低位在低地址端)、
  • Int、IntLe、Long、LongLe、DecimalLong(保存为十进制字符串)、HexadecimalUnsignedLong(保存为十六进制字符串)

大致便是根本的类型BufferedSink都能够直接写入,这儿就不翻开讲了,毕竟功用在RealBufferedSink中完成的,再来看下两个emit办法。实践上这办法的注释写的十分清楚:

BufferedSink b0 = new Buffer();
BufferedSink b1 = Okio.buffer(b0);
BufferedSink b2 = Okio.buffer(b1);
b2.writeUtf8("hello");
assertEquals(5, b2.buffer().size());
assertEquals(0, b1.buffer().size());
assertEquals(0, b0.buffer().size());
b2.emit();
assertEquals(0, b2.buffer().size());
assertEquals(5, b1.buffer().size());
assertEquals(0, b0.buffer().size());
b1.emit();
assertEquals(0, b2.buffer().size());
assertEquals(0, b1.buffer().size());
assertEquals(5, b0.buffer().size());

这儿说什么都不如看比如来的多,关于emitCompleteSegments也有注释,可是如同更难了解一些:

/**
 * Writes complete segments to the underlying sink, if one exists. Like {@link #flush}, but
 * weaker. Use this to limit the memory held in the buffer to a single segment. Typically
 * application code will not need to call this: it is only necessary when application code writes
 * directly to this {@linkplain #buffer() sink's buffer}. <pre>{@code
 */
BufferedSink b0 = new Buffer();
BufferedSink b1 = Okio.buffer(b0);
BufferedSink b2 = Okio.buffer(b1);
b2.buffer().write(new byte[20_000]);
assertEquals(20_000, b2.buffer().size());
assertEquals(     0, b1.buffer().size());
assertEquals(     0, b0.buffer().size());
b2.emitCompleteSegments();
assertEquals( 3_616, b2.buffer().size());
assertEquals(     0, b1.buffer().size());
assertEquals(16_384, b0.buffer().size()); // This example assumes 8192 byte segments.

结合英语了解下,大致意思便是这个emit是彻底提交,无论封装多少层,终究都是提交到原始的Buffer上,可是这个提交是看segment的,只有完整的segment才会被emit,及抵达SIZE(8192)的,没抵达的不提交。而且这个根本是RealBufferedSink中的writeXXX就处理了,用户除非操作了原始的buffer,否则是用不着的。

BufferedSource接口

BufferedSource接口和BufferedSink接口相似,没了flush办法,也多了几种办法:

  1. 判别读取完的exhausted办法
  2. getter办法: buffer、inputStream
  3. 一系列的readXXX办法,取得数据
  4. 用来越过的skip办法
  5. 用来判别要读取数据是否超出长度的require和request办法
  6. 用来判别一行前缀的select办法
  7. 一系列的indexOf办法
  8. 用来比较规模内持平的rangeEquals办法

exhausted办法便是source内没数据了,require和request办法都是判别取的数据是否超长,request会报反常,request则是回来false,readXXX办法和上面的writeXXX相似,skip办法便是越过,没什么好讲的,这儿就讲下后边三种办法。

关于select办法,也是看注释:

/**
 * Finds the first string in {@code options} that is a prefix of this buffer, consumes it from
 * this buffer, and returns its index. If no byte string in {@code options} is a prefix of this
 * buffer this returns -1 and no bytes are consumed.
 */
Options FIELDS = Options.of(
    ByteString.encodeUtf8("depth="),
    ByteString.encodeUtf8("height="),
    ByteString.encodeUtf8("width="));
Buffer buffer = new Buffer()
    .writeUtf8("width=640\n")
    .writeUtf8("height=480\n");
assertEquals(2, buffer.select(FIELDS));
assertEquals(640, buffer.readDecimalLong());
assertEquals('\n', buffer.readByte());
assertEquals(1, buffer.select(FIELDS));
assertEquals(480, buffer.readDecimalLong());
assertEquals('\n', buffer.readByte());

这儿最好看下英文注释,一开端我以为是用来筛选的,后边发现了解错了,这个是对prefix的一种识别,回来界说在options中的index,options里边没有就回来-1。

indexOf办法也没什么好说的,便是对byte、ByteString、Element(多个匹配字符任选)的匹配,支撑fromIndex和toIndex之类的。

rangeEquals办法和indexOf相似,都是在规模内判别是否持平,不过rangeEquals办法回来的是true or false。

RealBufferedSource和RealBufferedSink

尽管说RealBufferedSource和RealBufferedSink终究完成了BufferedSource和BufferedSink接口,可是翻开源码,咱们却能够发现里边的操作根本都是交给了Buffer去处理的,Buffer咱们下一节看,这儿咱们就来看下这两个类加了Real之后有什么不同。

RealBufferedSource

这儿先来看下数据域和构造函数:

  public final Buffer buffer = new Buffer();
  public final Source source;
  boolean closed;
  RealBufferedSource(Source source) {
    if (source == null) throw new NullPointerException("source == null");
    this.source = source;
  }

RealBufferedSource内保存了外部传来的source,buffer是直接new出来的,closed变量是通过调用close办法操控的,当其被符号为true的时分就不能运用了:

  @Override public void close() throws IOException {
    if (closed) return;
    closed = true;
    source.close();
    buffer.clear();
  }
  @Override public boolean isOpen() {
    return !closed;
  }

这儿的close办法是从Channel接口过来的,Channel还有个办法isOpen,其实Channel这个接口是从BufferedSource来的:

RealBufferedSource -> BufferedSource -> ReadableByteChannel -> Channel

这儿有个ReadableByteChannel办法,前面讲BufferedSource时,BufferedSource并没有完成它(三个办法),所以越过了,现在咱们持续看下ReadableByteChannel接口:

public interface ReadableByteChannel extends Channel {
    public int read(ByteBuffer dst) throws IOException;
}

比BufferedSource许多readXXX多了一个read办法,而且这个是从ByteBuffer读取的,这个ByteBuffer很长,咱们只需知道它是一个Buffer就行了。

public abstract class ByteBuffer extends Buffer implements Comparable<ByteBuffer> { ... }

看完从ReadableByteChannel和Channel来的办法,RealBufferedSource就剩余来自Source和BufferedSource接口的办法了,并没有其自己的额外办法。

下面咱们挑个办法讲下,首要仍是得看Buffer的内容:

  @Override public long read(Buffer sink, long byteCount) throws IOException {
    if (sink == null) throw new IllegalArgumentException("sink == null");
    if (byteCount < 0) throw new IllegalArgumentException("byteCount < 0: " + byteCount);
    if (closed) throw new IllegalStateException("closed");
    // buffer没数据,向buffet里边读取一个segment的长度
    if (buffer.size == 0) {
      long read = source.read(buffer, Segment.SIZE);
      if (read == -1) return -1;
    }
    // 考虑下byteCount > buffer.size的时分,按buffer.size读,所以read在外部调用的时分需求调用多次(套循环)
    long toRead = Math.min(byteCount, buffer.size);
    // 从buffer读取到sink中去
    return buffer.read(sink, toRead);
  }

这个是承继自Source的办法,其他的也大差不差,都是先判别下,然后交给buffer去做,读者能够看下源码,我这就不写了。

RealBufferedSink

RealBufferedSink和上面介绍的RealBufferedSource根本一模相同,仅仅source换成了sink,终究也是通过Buffer去完成的,这儿就看下几个不相同的办法。

emitCompleteSegments办法在BufferedSink中提到过,意思是提交写满了数据的segment到最底层的sink,它会在每个writeXXX的终究调用,这儿看下它的详细完成:

  @Override public BufferedSink writeHexadecimalUnsignedLong(long v) throws IOException {
    if (closed) throw new IllegalStateException("closed");
    buffer.writeHexadecimalUnsignedLong(v);
    // writeXXX的终究都会调用
    return emitCompleteSegments();
  }
  @Override public BufferedSink emitCompleteSegments() throws IOException {
    if (closed) throw new IllegalStateException("closed");
    // 得到的是segment最大值的整数倍
    long byteCount = buffer.completeSegmentByteCount();
    // 所以这儿写入到sink的是写满的segment
    if (byteCount > 0) sink.write(buffer, byteCount);
    return this;
  }

flush办法是RealBufferedSource中没有的,咱们这儿也看下:

  @Override public void flush() throws IOException {
    if (closed) throw new IllegalStateException("closed");
    if (buffer.size > 0) {
      // 悉数写入到sink
      sink.write(buffer, buffer.size);
    }
    // 改写
    sink.flush();
  }

close办法也有改变,留意下:

@Override public void close() throws IOException {
    if (closed) return;
    // Emit buffered data to the underlying sink. If this fails, we still need
    // to close the sink; otherwise we risk leaking resources.
    Throwable thrown = null;
    try {
      // 先把缓存数据写入再关闭,出错了会丢失数据也没办法
      if (buffer.size > 0) {
        sink.write(buffer, buffer.size);
      }
    } catch (Throwable e) {
      thrown = e;
    }
    try {
      sink.close();
    } catch (Throwable e) {
      if (thrown == null) thrown = e;
    }
    closed = true;
    // 先试试关闭后,再把两个会反常的当地的反常抛出
    if (thrown != null) Util.sneakyRethrow(thrown);
  }

Buffer

讲了那么多总算到Buffer这个最中心的类了!提到buffer,为什么要用buffer,这个就涉及到计算机的操作体系了,拜访磁盘和网卡等 IO 操作需求通过体系调用来履行,频频调用操作体系会带来上下文切换的功用损耗。以读取为例,一次性读取一些数据并保存起来(buffer),在需求的时分从buffer里读取,这样就能下降功用损耗。 当然,这是我总结的话,或许不太准,大致意思便是这样。

其实在Java里边也用到了buffer,也便是BufferedInputStream和BufferedOutputStream,里边是通过一个byte数组完成的,Okio的buffer则不相同,里边通过Segment双向链表完成buffer,并且buffer还支撑同享(上面提到了,能够同享segment中的data数组),同享能够减少双流仿制时不必要的开销。

Buffer的结构

Buffer有两千多行,一个一个办法去讲有点不现实,和上面相同,咱们先看Buffer承继的办法,再来看它自己的办法,并对它自己的办法分类,这样就能够把Buffer看成如下结构:

  • Source
  • Sink
  • Channel
  • ReadableByteChannel
  • WritableByteChannel
  • BufferedSource
  • BufferedSink
  • 内部类: UnsafeCursor,及四个获取UnsafeCursor的办法
  • 加密: md5、sha1等
  • snapshot: 取得缓存数据的ByteString
  • copy、write、read、getByte(pos)、clear、skip、rangeEquals,对缓存操作
  • 获取参数: size、completeSegmentByteCount(segment剩余空间)、segmentSizes
  • selectPrefix,判别前缀格局(配合options取得index)
  • writableSegment(minimumCapacity),获取能够够容量的segment

看着有允许大,简略的就不说了,咱们下面慢慢说。

Source、Sink、三个Channel接口

先来看前五个接口的,这几个接口简略,办法数比较少,先来看下Channel接口的完成:

  @Override public boolean isOpen() { return true; }
  @Override public void close() {}

就这?没错,这两个办法是空的,而ReadableByteChannel和WritableByteChannel就有点东西了:

// ReadableByteChannel
@Override public int read(ByteBuffer sink) throws IOException {
    Segment s = head;
    if (s == null) return -1;
    // 最大读入其时segment能剩余的字节数
    int toCopy = Math.min(sink.remaining(), s.limit - s.pos);
    sink.put(s.data, s.pos, toCopy);
    s.pos += toCopy;
    size -= toCopy;     // size是Buffer剩余的size
    // 其时segment读取完,pop并存到SegmentPool去
    if (s.pos == s.limit) {
      head = s.pop();
      SegmentPool.recycle(s);
    }
    return toCopy;
  }
// WritableByteChannel
@Override public int write(ByteBuffer source) throws IOException {
    if (source == null) throw new IllegalArgumentException("source == null");
    int byteCount = source.remaining();
    int remaining = byteCount;
    while (remaining > 0) {
      // 从SegmentPool取一个能够写入的segment,至少能写一个字节
      Segment tail = writableSegment(1);
      // 最多能够写入segment剩余的字节数
      int toCopy = Math.min(remaining, Segment.SIZE - tail.limit);
      source.get(tail.data, tail.limit, toCopy);
      remaining -= toCopy;
      tail.limit += toCopy;     // limit便是下一个能够写入的方位
    }
    size += byteCount;
    return byteCount;
  }

不是很杂乱,首要就在sink.put(..)和source.get(…)办法,这个是ByteBuffer的功用,咱们后边再巧,不过也好了解。再来看下Source和Sink的详细完成:

  // Source
  @Override public long read(Buffer sink, long byteCount) {
    if (sink == null) throw new IllegalArgumentException("sink == null");
    if (byteCount < 0) throw new IllegalArgumentException("byteCount < 0: " + byteCount);
    if (size == 0) return -1L;
    if (byteCount > size) byteCount = size;
    // 便是从外来的Buffer写入到本Buffer
    sink.write(this, byteCount);
    return byteCount;
  }
  // Sink
  @Override public void write(Buffer source, long byteCount) {
    // 很长一段注释。。。goals: don't waste CPU and don't waste memory.
    if (source == null) throw new IllegalArgumentException("source == null");
    if (source == this) throw new IllegalArgumentException("source == this");
    checkOffsetAndCount(source.size, 0, byteCount);
    while (byteCount > 0) {
      // 这时分数据只在source其时segment中
      // 假如其时buffer终究一个segment能包容这些数据,那就没有必要移动去移动segment了
      // 假如其时buffer终究一个segment包容不了或许不能修正里边data,那仍是得拼接曩昔
      // 而且由于source其时segment中或许还有不需求读的数据,要把这个segment按byteCount一分为二,前面部分拼接曩昔
      // Is a prefix of the source's head segment all that we need to move?
      if (byteCount < (source.head.limit - source.head.pos)) {
        // 双向链表,head的prev便是tail啊,写入要在尾部追加
        Segment tail = head != null ? head.prev : null;
        // tail(segment)是其data一切者,即能够修正data数据
        if (tail != null && tail.owner
            // 剩余空间加上byteCount后不超过segment最大长度,limit表明下一个可写方位,pos表明下一个可读方位
            // 假如其时segment前面部分share出去了,前面部分都不能写入(0 - limit),没被share是不是或许segment前后都有空余空间?
            && (byteCount + tail.limit - (tail.shared ? 0 : tail.pos) <= Segment.SIZE)) {
          // Our existing segments are sufficient. Move bytes from source's head to our tail.
          source.head.writeTo(tail, (int) byteCount);   // 充裕的,能够写入
          source.size -= byteCount;
          size += byteCount;
          // byteCount在source和tail其时segment都允许的状况下,仿制一下就行了,这儿就over了
          return;
        } else {
          // 留意这儿是在byteCount小于source其时segment巨细的if里边,这儿把source.head分红两个segment
          // 榜首部分长byteCount,下面还在while中会进行处理
          // We're going to need another segment. Split the source's head
          // segment in two, then move the first of those two to this buffer.
          source.head = source.head.split((int) byteCount);
        }
      }
      // segment功率中心来了!直接把source的segment拼接到其时Buffer链表终究就行了
      // Remove the source's head segment and append it to our tail.
      Segment segmentToMove = source.head;
      long movedByteCount = segmentToMove.limit - segmentToMove.pos;
      // 从source中移除改segment
      source.head = segmentToMove.pop();
      if (head == null) {
        // head为空的时分直接作为链表头,首尾指针指向自己(构成闭环,tail和head都是自己)
        head = segmentToMove;
        head.next = head.prev = head;
      } else {
        // 在终究拼接就行
        Segment tail = head.prev;
        tail = tail.push(segmentToMove);
        // 消除一下拼接后中心的冗余,由于拼接后tail前一个的空余容量或许比tail的巨细更大,这时分tail是不必要的,能够被收回
        tail.compact();
      }
      // 处理下巨细记载,byteCount被修正会影响while循环
      source.size -= movedByteCount;
      size += movedByteCount;
      byteCount -= movedByteCount;
    }
  }
  // 默许不超时检测
  @Override public Timeout timeout() { return Timeout.NONE; }

Source的read办法很简略,Sink的timeout办法默许也是回来一个不超时检测的Timeout,也就Sink的write办法杂乱些,作者还在前面加了很长的注释(大致便是让别糟蹋内存、CPU、合理运用segment之类的),上面注释写的很清楚了,不过我仍是想再流通的简述下,便于了解。

当咱们从一个buffer往另一个buffer里边写的时分,由于segment的存在,咱们只需把segment从source的链表移动到sink的链表中去就行了,可是这要考虑下特殊状况,榜首个是移动曩昔的segment或许能够与前一个segment合并,需求做下合并操作; 第二个便是byteCount比较小,直接就能仿制到sink终究一个segment里边去; 第三个便是尽管byteCount比较小,能够仿制曩昔,可是sink终究一个segment没有修正data数组的能力(!owner),仿制不曩昔,只能拼接曩昔; 第四个便是source的终究一个segment或许包括两部分数据,一部分要读取的,一部分不需求读取的,这时分就要将这个segment一分为二,把要读取的那部分移动曩昔就行了;

看完这段代码,真便是牛逼!这个便是okio高效的中心吧。

BufferedSource和BufferedSink的完成

上面有一节咱们专门讲到了BufferedSource和BufferedSink,介绍了里边办法的功用,不过终究详细完成都是Buffer完成的,下面挑几个我觉得有意思的办法完成看下。

BufferedSource接口完成

readInt办法

从readXXX根本类型里边挑一个Int看一下,我觉得仍是有代表性的:

@Override public int readInt() {
    // int型四字节
    if (size < 4) throw new IllegalStateException("size < 4: " + size);
    Segment segment = head;
    int pos = segment.pos;
    int limit = segment.limit;
    // If the int is split across multiple segments, delegate to readByte().
    if (limit - pos < 4) {
      // 不行四字节,降级分红四个一字节的byte读取,通过移位合成4byte的int,readByte里边会处理segment的切换
      return (readByte() & 0xff) << 24
          |  (readByte() & 0xff) << 16
          |  (readByte() & 0xff) <<  8
          |  (readByte() & 0xff);
    }
    // 从其时segment读取四个字节
    byte[] data = segment.data;
    int i = (data[pos++] & 0xff) << 24
        |   (data[pos++] & 0xff) << 16
        |   (data[pos++] & 0xff) <<  8
        |   (data[pos++] & 0xff);
    size -= 4;
    if (pos == limit) {
      // 刚好读完一个segment
      head = segment.pop();
      SegmentPool.recycle(segment);
    } else {
      segment.pos = pos;
    }
    return i;
  }
  // 高低位反向,在嵌入式中仍是挺常见的吧
  @Override public int readIntLe() {
    return Util.reverseBytesInt(readInt());
  }

其实仍是挺简略的,特别是和上面Sink的write办法比起来,不过仍是要留意下segment的边界状况。

readDecimalLong办法

readDecimalLong办法是一个很有意思的办法,前面咱们讲过了,会把数按十进制的字符串读取,下面看下完成:

@Override public long readDecimalLong() {
    if (size == 0) throw new IllegalStateException("size == 0");
    // This value is always built negatively in order to accommodate Long.MIN_VALUE.
    long value = 0;             // 值,一致按负数算,由于Long.MIN_VALUE绝对值大一
    int seen = 0;               // 其时读了几位
    boolean negative = false;   // 负数
    boolean done = false;
    // long 的规模 -2,147,483,648 到 2,147,483,647
    long overflowZone = Long.MIN_VALUE / 10;            // 加终究一位前假如溢出了,就不需求持续下去了
    long overflowDigit = (Long.MIN_VALUE % 10) + 1;     // 单个字符对应溢出的数字,即终究一位不能大于该数
    do {
      Segment segment = head;
      byte[] data = segment.data;
      int pos = segment.pos;
      int limit = segment.limit;
      for (; pos < limit; pos++, seen++) {
        byte b = data[pos];
        if (b >= '0' && b <= '9') {
          // 规划成取负数
          int digit = '0' - b;
          // 由于是负数,所以更小便是溢出了,溢出后拿了个新Buffer拼出字符串抛出反常,假如有负号,越过负号对应字符
          // Detect when the digit would cause an overflow.
          if (value < overflowZone || value == overflowZone && digit < overflowDigit) {
            Buffer buffer = new Buffer().writeDecimalLong(value).writeByte(b);
            if (!negative) buffer.readByte(); // Skip negative sign.
            throw new NumberFormatException("Number too large: " + buffer.readUtf8());
          }
          value *= 10;
          value += digit;
        } else if (b == '-' && seen == 0) {
          // 榜首个方位或许是负号,long负数的规模比正数规模大一
          negative = true;
          overflowDigit -= 1;
        } else {
          // 第0个方位呈现了其他字符,直接便是反常
          if (seen == 0) {
            throw new NumberFormatException(
                "Expected leading [0-9] or '-' character but was 0x" + Integer.toHexString(b));
          }
          // Set a flag to stop iteration. We still need to run through segment updating below.
          done = true;
          break;
        }
      }
      if (pos == limit) {
        head = segment.pop();
        SegmentPool.recycle(segment);
      } else {
        // 读完一位
        segment.pos = pos;
      }
    } while (!done && head != null);
    // 减去读取的长度,根据负号符号对存在负数里的成果处理
    size -= seen;
    return negative ? value : -value;
  }

这儿便是一个从字符串里读取long型数据的功用,里边验证数据规模的逻辑仍是挺有意思的,这个负数和减一用的很奇妙。

read(byte[]…)办法

上面讲到了从ReadableByteChannel承继来的read(ByteBuffer)办法,可是关于Buffer里边大部分的read办法其实都是从下面办法得到的成果:

  @Override public int read(byte[] sink, int offset, int byteCount) {
    checkOffsetAndCount(sink.length, offset, byteCount);
    Segment s = head;
    if (s == null) return -1;
    // 得到head能够仿制的容量
    int toCopy = Math.min(byteCount, s.limit - s.pos);
    System.arraycopy(s.data, s.pos, sink, offset, toCopy);
    s.pos += toCopy;
    size -= toCopy;
    if (s.pos == s.limit) {
      head = s.pop();
      SegmentPool.recycle(s);
    }
    return toCopy;
  }
indexOf办法

查找一般很考验算法,okio这儿正好有一个,那仍是得好好研讨下。

@Override public long indexOf(ByteString bytes, long fromIndex) throws IOException {
    if (bytes.size() == 0) throw new IllegalArgumentException("bytes is empty");
    if (fromIndex < 0) throw new IllegalArgumentException("fromIndex < 0");
    Segment s;
    long offset;
    // TODO(jwilson): extract this to a shared helper method when can do so without allocating.
    findSegmentAndOffset: {
      // Pick the first segment to scan. This is the first segment with offset <= fromIndex.
      s = head;
      if (s == null) {
        // No segments to scan!
        return -1L;
        // 榜首步,先找到对应fromIndex的segment,二分法,哪边更小从哪边操作
      } else if (size - fromIndex < fromIndex) {
        // We're scanning in the back half of this buffer. Find the segment starting at the back.
        offset = size;
        // 从终究一个segment开端算,offset从size开端减去这个segment的长度,假如offset更小了,s便是fromIndex对应的segment
        while (offset > fromIndex) {
          s = s.prev;
          offset -= (s.limit - s.pos);
        }
      } else {
        // We're scanning in the front half of this buffer. Find the segment starting at the front.
        offset = 0L;
        // 和前面相似,nextOffset每次加上一个segment长度,假如加上后比formIndex更大了,那s便是formIndex对应的segment
        for (long nextOffset; (nextOffset = offset + (s.limit - s.pos)) < fromIndex; ) {
          s = s.next;
          offset = nextOffset;
        }
      }
    }
    // 第二步,从对应的segment中寻觅
    // Scan through the segments, searching for the lead byte. Each time that is found, delegate to
    // rangeEquals() to check for a complete match.
    byte b0 = bytes.getByte(0);
    int bytesSize = bytes.size();
    // 被查数据榜首个字符所在方位的最大值,offset是其时segment榜首个方位(指在Buffer中)
    long resultLimit = size - bytesSize + 1;
    // 榜首层循环意图是让查找能够通过多个segment
    while (offset < resultLimit) {
      // Scan through the current segment.
      byte[] data = s.data;
      // 在该segment中的限制,第二个参数指在该segment中序号,那就不能大于segment长度
      int segmentLimit = (int) Math.min(s.limit, s.pos + resultLimit - offset);
      // 换算到segment内对应的pos(fromIndex - offset是偏移量,s.pos是segment内数据开端方位)
      for (int pos = (int) (s.pos + fromIndex - offset); pos < segmentLimit; pos++) {
        // 假如榜首个字符匹配上了,就用rangeEquals去比对后边的数据,假如匹配上了就把在Buffer里边的方位回来回去
        // rangeEquals内会跨越segment进行查找
        if (data[pos] == b0 && rangeEquals(s, pos + 1, bytes, 1, bytesSize)) {
          // offset + (pos - s.pos),offset是segment在Buffer的方位,加上数据在其时segment的便宜
          return pos - s.pos + offset;
        }
      }
      // Not in this segment. Try the next one.
      offset += (s.limit - s.pos);
      fromIndex = offset;
      s = s.next;
    }
    return -1L;
  }

这儿看得有允许晕了,首要得明确Buffer中的方位(offset、fromIndex)和在segment中的方位(pos、limit),首要根据fromIndex确认从哪个segment开端比较,确认后再运用segment中的方位进行比较榜首位,榜首位比较对上了,就调用rangeEquals进行下一步比较,它里边会支撑跨segment比较,成功了会回来true or false,大致便是这样。

rangeEquals办法

既然上面提到了rangeEquals办法,咱们这儿也看下,比较简略:

private boolean rangeEquals(
      Segment segment, int segmentPos, ByteString bytes, int bytesOffset, int bytesLimit) {
    // 其时segment终究方位
    int segmentLimit = segment.limit;
    byte[] data = segment.data;
    // 比较到bytes终究一位就算成功了
    for (int i = bytesOffset; i < bytesLimit; ) {
      // 抵达segment终究,进入下一个segment比较
      if (segmentPos == segmentLimit) {
        segment = segment.next;
        data = segment.data;
        segmentPos = segment.pos;
        segmentLimit = segment.limit;
      }
      if (data[segmentPos] != bytes.getByte(i)) {
        return false;
      }
      segmentPos++;
      i++;
    }
    return true;
  }

BufferedSink接口完成

看完BufferedSource接口的完成,咱们再来看下BufferedSink接口的完成,和上面相同,咱们选几个办法看下。

write(ByteString)办法

前面BufferedSource接口中的readByteString比较简略,咱们直接越过了,可是BufferedSink接口的ByteString仍是有点东西的,值得一看:

  @Override public Buffer write(ByteString byteString) {
    if (byteString == null) throw new IllegalArgumentException("byteString == null");
    byteString.write(this);
    return this;
  }
  // ByteString内write办法
  void write(Buffer buffer) {
    buffer.write(data, 0, data.length);
  }
  // Buffer的办法
  @Override public Buffer write(byte[] source, int offset, int byteCount) {
    if (source == null) throw new IllegalArgumentException("source == null");
    checkOffsetAndCount(source.length, offset, byteCount);
    int limit = offset + byteCount;
    while (offset < limit) {
      Segment tail = writableSegment(1);
      int toCopy = Math.min(limit - offset, Segment.SIZE - tail.limit);
      System.arraycopy(source, offset, tail.data, tail.limit, toCopy);
      offset += toCopy;
      tail.limit += toCopy;
    }
    size += byteCount;
    return this;
  }

这儿ByteString居然反过来又调用Buffer的办法进行写入,这不是多此一举么?其实仍是有意义的,这儿ByteString的数据没有向外供给,而是通过外部对象把自己数据写到外部,这样更安全吧。

Buffer write(byte[] source, offset, byteCount)办法

上面提到了Buffer的write(byte[] source, offset, byteCount)办法,这儿也说一下,实践便是供给一个数组去写入,很实用。许多当地都是通过调用它完成功用的,比如RealBufferedSink中、outputStream办法中。

在okhttp源码中用了大量的write(byte[] source),其终究完成也是在这儿:

  @Override public Buffer write(byte[] source) {
    if (source == null) throw new IllegalArgumentException("source == null");
    return write(source, 0, source.length);
  }
writeUtf8和writeUtf8CodePoint办法

关于UTF-8的写入比较杂乱,这儿就用writeUtf8简略讲讲。我也是现找材料学的,比必定正确,可是能多学点东西总有优点:

@Override public Buffer writeUtf8(String string, int beginIndex, int endIndex) {
    // 反常处理。。。忽略
    // UTF-16任何字符对应的数字都用两个字节(65536)来保存,UTF-8有或许是用一个字节表明一个字符,也或许是两个,三个,最多四个
    // UTF-8: 0xxxxxxx(1byte,128,ascii码),110xxxxx 10xxxxxx(2byte,2048),1110xxxx 10xxxxxx 10xxxxxx(3byte,65536)
    // UTF-16没有标志位,容错性高; UTF-8常用于网络传输,UTF-16用于贮存
    // Transcode a UTF-16 Java String to UTF-8 bytes. 将UTF-16转为UTF-8
    for (int i = beginIndex; i < endIndex;) {
      // charAt会按字符取,即c是两字节得UTF-16(65536)
      int c = string.charAt(i);
      // 转成1byte的UTF-8,即ASCII码
      if (c < 0x80) {
        Segment tail = writableSegment(1);
        byte[] data = tail.data;
        // 其时segment中其时字节得方位
        int segmentOffset = tail.limit - i;
        // 其时segment能包容得字符个数
        int runLimit = Math.min(endIndex, Segment.SIZE - segmentOffset);
        // Emit a 7-bit character with 1 byte.
        data[segmentOffset + i++] = (byte) c; // 0xxxxxxx
        // 试着持续当ASCII码写,提高功率
        // Fast-path contiguous runs of ASCII characters. This is ugly, but yields a ~4x performance
        // improvement over independent calls to writeByte().
        while (i < runLimit) {
          c = string.charAt(i);
          if (c >= 0x80) break;
          data[segmentOffset + i++] = (byte) c; // 0xxxxxxx
        }
        int runSize = i + segmentOffset - tail.limit; // Equivalent to i - (previous i).
        tail.limit += runSize;
        size += runSize;
      // 两字节状况,[128, 2048),将c写入到两个byte中去
      } else if (c < 0x800) {
        // xxxx xxx(x xx)(xx xxxx)
        // Emit a 11-bit character with 2 bytes.
        writeByte(c >>  6        | 0xc0); // 110xxxxx
        writeByte(c       & 0x3f | 0x80); // 10xxxxxx
        i++;
      // 三字节状况
      } else if (c < 0xd800 || c > 0xdfff) {
        // (xxxx) (xxxx xx)(xx xxxx)
        // Emit a 16-bit character with 3 bytes.
        writeByte(c >> 12        | 0xe0); // 1110xxxx
        writeByte(c >>  6 & 0x3f | 0x80); // 10xxxxxx
        writeByte(c       & 0x3f | 0x80); // 10xxxxxx
        i++;
      // 四个字节状况。
      // 假如字符是署理对,则它被编码为四个字节的序列。署理对是一对Unicode字符,用于表明UTF-16编码中BMP之外的字符。
      } else {
        // c is a surrogate. Make sure it is a high surrogate & that its successor is a low
        // surrogate. If not, the UTF-16 is invalid, in which case we emit a replacement character.
        int low = i + 1 < endIndex ? string.charAt(i + 1) : 0;
        if (c > 0xdbff || low < 0xdc00 || low > 0xdfff) {
          writeByte('?');
          i++;
          continue;
        }
        // UTF-16 high surrogate: 110110xxxxxxxxxx (10 bits)
        // UTF-16 low surrogate:  110111yyyyyyyyyy (10 bits)
        // Unicode code point:    00010000000000000000 + xxxxxxxxxxyyyyyyyyyy (21 bits)
        int codePoint = 0x010000 + ((c & ~0xd800) << 10 | low & ~0xdc00);
        // Emit a 21-bit character with 4 bytes.
        writeByte(codePoint >> 18        | 0xf0); // 11110xxx
        writeByte(codePoint >> 12 & 0x3f | 0x80); // 10xxxxxx
        writeByte(codePoint >>  6 & 0x3f | 0x80); // 10xxyyyy
        writeByte(codePoint       & 0x3f | 0x80); // 10yyyyyy
        i += 2;
      }
    }
    return this;
  }

有点懵逼,UTF-8的格局找了找材料,三个字节和四个字节的不是很清楚,可是大致功用好了解,便是把UTF-16的字符转成UTF-8的字符,运用charAt能得到字符的int型,然后根据规模去生成UTF-8,或许是一个字符,或许是两个字符,乃至三个四个字符,按格局生成字节就行了。

writeInt办法

对根本类型的写入比较简略,按字节写入segment的data里边就行了,没什么好说的。

@Override public Buffer writeInt(int i) {
    Segment tail = writableSegment(4);
    byte[] data = tail.data;
    int limit = tail.limit;
    data[limit++] = (byte) ((i >>> 24) & 0xff);
    data[limit++] = (byte) ((i >>> 16) & 0xff);
    data[limit++] = (byte) ((i >>>  8) & 0xff);
    data[limit++] = (byte)  (i         & 0xff);
    tail.limit = limit;
    size += 4;
    return this;
  }
writeDecimalLong办法

和前面readDecimalLong办法相似,

@Override public Buffer writeDecimalLong(long v) {
    if (v == 0) {
      // Both a shortcut and required since the following code can't handle zero.
      return writeByte('0');
    }
    boolean negative = false;
    if (v < 0) {
      v = -v;
      if (v < 0) { // Only true for Long.MIN_VALUE.
        return writeUtf8("-9223372036854775808");
      }
      negative = true;
    }
    // 也是凶猛哦
    // Binary search for character width which favors matching lower numbers.
    int width = //
          v < 100000000L
        ? v < 10000L
        ? v < 100L
        ? v < 10L ? 1 : 2
        : v < 1000L ? 3 : 4
        : v < 1000000L
        ? v < 100000L ? 5 : 6
        : v < 10000000L ? 7 : 8
        : v < 1000000000000L
        ? v < 10000000000L
        ? v < 1000000000L ? 9 : 10
        : v < 100000000000L ? 11 : 12
        : v < 1000000000000000L
        ? v < 10000000000000L ? 13
        : v < 100000000000000L ? 14 : 15
        : v < 100000000000000000L
        ? v < 10000000000000000L ? 16 : 17
        : v < 1000000000000000000L ? 18 : 19;
    if (negative) {
      // 负数加一位符号位
      ++width;
    }
    Segment tail = writableSegment(width);
    byte[] data = tail.data;
    // 循环取余写入,终究写入符号位
    int pos = tail.limit + width; // We write backwards from right to left.
    while (v != 0) {
      int digit = (int) (v % 10);
      data[--pos] = DIGITS[digit];
      v /= 10;
    }
    if (negative) {
      data[--pos] = '-';
    }
    tail.limit += width;
    this.size += width;
    return this;
  }

看起来还说挺好了解的,获取长度那里尽管看起来长了点,如同还没什么好办法。

Buffer本身办法

在上面现已对Buffer本身的办法做了个总结,大致如下:

  • 内部类: UnsafeCursor,及四个获取UnsafeCursor的办法
  • 加密: md5、sha1等
  • snapshot: 取得缓存数据的ByteString
  • copy、write、read、getByte(pos)、clear、skip、rangeEquals,对缓存操作
  • 获取参数: size、completeSegmentByteCount(segment剩余空间)、segmentSizes
  • selectPrefix,判别前缀格局(配合options取得index)
  • writableSegment(minimumCapacity),获取能够够容量的segment

UnsafeCursor是这个Buffer的一个操作类,比较中心,仍是得讲一下的。

加密就越过了,关于ByteString的内容能够简略讲讲,学下原理就能够了,内容好多。

对缓存的操作,不太想讲了,都是对segment的处理,在RealBufferedSource和RealBufferedSink以及Buffer内看得够多了。

获取参数的办法看一下就好,selectPrefix以及writableSegment仍是能够看下的。

内部类: UnsafeCursor,及四个获取UnsafeCursor的办法

先来看下UnsafeCursor自己的介绍吧,大致iiu是对缓存区根底数据的不安全处理器,这个安全问题需求用户自己留意,代码注释里边还有很长的比如,这儿就不贴出来了。

A handle to the underlying data in a buffer. This handle is unsafe because it does not enforce its own invariants. Instead, it assumes a careful user who has studied Okio’s implementation details and their consequences. 缓冲区中根底数据的handle。该handle是不安全的,由于它不强制履行自己的不变量。相反,它假定一个细心的用户现已研讨了 Okio 的完成细节及其后果。

变量域

先来看下变量域,先来了解下这些成员变量代表的意义:

    // 操控的buffer
    public Buffer buffer;
    // 可读可写?
    public boolean readWrite;
    // 其时segment
    private Segment segment;
    // 其时偏移,最大为buffer.size
    public long offset = -1L;
    // references the segment's internal byte array
    public byte[] data;
    // is the segment's start,在buffer里边的方位
    public int start = -1;
    // is the segment's end,在buffer里边的方位
    public int end = -1;

下面再来了解办法,next和close比较简略就不写了。

seek办法
    public final int seek(long offset) {
      if (offset < -1 || offset > buffer.size) {
        throw new ArrayIndexOutOfBoundsException(
            String.format("offset=%s > size=%s", offset, buffer.size));
      }
      // 规模外
      if (offset == -1 || offset == buffer.size) {
        this.segment = null;
        this.offset = offset;
        this.data = null;
        this.start = -1;
        this.end = -1;
        return -1;
      }
      // Navigate to the segment that contains `offset`. Start from our current segment if possible.
      long min = 0L;
      long max = buffer.size;
      Segment head = buffer.head;
      Segment tail = buffer.head;
      if (this.segment != null) {
        // 取得其时segment榜首个方位在buffer中的偏移
        long segmentOffset = this.offset - (this.start - this.segment.pos);
        // 根据其时segment方位修正min或max,缩小规模,以其时segment为界分两部分
        if (segmentOffset > offset) {
          // Set the cursor segment to be the 'end'
          max = segmentOffset;
          tail = this.segment;
        } else {
          // Set the cursor segment to be the 'beginning'
          min = segmentOffset;
          head = this.segment;
        }
      }
      // 从小的部分开端寻觅offset所在segment
      Segment next;
      long nextOffset;
      if (max - offset > offset - min) {
        // Start at the 'beginning' and search forwards
        next = head;
        nextOffset = min;
        while (offset >= nextOffset + (next.limit - next.pos)) {
          nextOffset += (next.limit - next.pos);
          next = next.next;
        }
      } else {
        // Start at the 'end' and search backwards
        next = tail;
        nextOffset = max;
        while (nextOffset > offset) {
          next = next.prev;
          nextOffset -= (next.limit - next.pos);
        }
      }
      // If we're going to write and our segment is shared, swap it for a read-write one.
      if (readWrite && next.shared) {
        // 假如该segment被同享了,就仿制一个不被同享的segment,并在下面代码代替原本的
        Segment unsharedNext = next.unsharedCopy();
        if (buffer.head == next) {
          buffer.head = unsharedNext;
        }
        next = next.push(unsharedNext);
        next.prev.pop();
      }
      // 更新Buffer的数据域
      // Update this cursor to the requested offset within the found segment.
      this.segment = next;
      this.offset = offset;
      this.data = next.data;
      this.start = next.pos + (int) (offset - nextOffset);
      this.end = next.limit;
      return end - start;
    }

seek办法不是很杂乱,意图便是根据offset找到对应的segment,并在这个segment找到offset的方位,详细看上面注释。

resizeBuffer办法
public final long resizeBuffer(long newSize) {
      if (buffer == null) {
        throw new IllegalStateException("not attached to a buffer");
      }
      if (!readWrite) {
        throw new IllegalStateException("resizeBuffer() only permitted for read/write buffers");
      }
      long oldSize = buffer.size;
      if (newSize <= oldSize) {
        if (newSize < 0) {
          throw new IllegalArgumentException("newSize < 0: " + newSize);
        }
        // 从终究一个segment开端pop并收回,直到终究一个segment正好能包容newSize
        // Shrink the buffer by either shrinking segments or removing them.
        for (long bytesToSubtract = oldSize - newSize; bytesToSubtract > 0; ) {
          Segment tail = buffer.head.prev;
          int tailSize = tail.limit - tail.pos;
          if (tailSize <= bytesToSubtract) {
            // tail被pop,回来的是tail的next,即head,或许为null
            buffer.head = tail.pop();
            SegmentPool.recycle(tail);
            bytesToSubtract -= tailSize;
          } else {
            // 把终究一个segment可写的方位缩到刚好newSize的方位
            tail.limit -= bytesToSubtract;
            break;
          }
        }
        // Seek to the end. 重置数据域
        this.segment = null;
        this.offset = newSize;
        this.data = null;
        this.start = -1;
        this.end = -1;
      } else if (newSize > oldSize) {
        // 逐一增加segment使之抵达newSize要求
        // Enlarge the buffer by either enlarging segments or adding them.
        boolean needsToSeek = true;
        for (long bytesToAdd = newSize - oldSize; bytesToAdd > 0; ) {
          // 先验证终究一个segment能写入的巨细,能完成bytesToAdd就增加bytesToAdd,否则就填满segment
          // writableSegment会验证是否写满了segment,不行会自动增加和切换到新segment
          Segment tail = buffer.writableSegment(1);
          int segmentBytesToAdd = (int) Math.min(bytesToAdd, Segment.SIZE - tail.limit);
          tail.limit += segmentBytesToAdd;
          bytesToAdd -= segmentBytesToAdd;
          // 只履行一次,一开端的时分跳到末尾segment
          // If this is the first segment we're adding, seek to it.
          if (needsToSeek) {
            this.segment = tail;
            this.offset = oldSize;
            this.data = tail.data;
            this.start = tail.limit - segmentBytesToAdd;
            this.end = tail.limit;
            needsToSeek = false;
          }
        }
      }
      buffer.size = newSize;
      return oldSize;
    }

resizeBuffer办法和它的姓名相同,便是来对buffer巨细进行操控的,会根据传入的size对segment进行收回或许创建。

expandBuffer办法
public final long expandBuffer(int minByteCount) {
      if (minByteCount <= 0) {
        throw new IllegalArgumentException("minByteCount <= 0: " + minByteCount);
      }
      if (minByteCount > Segment.SIZE) {
        throw new IllegalArgumentException("minByteCount > Segment.SIZE: " + minByteCount);
      }
      if (buffer == null) {
        throw new IllegalStateException("not attached to a buffer");
      }
      if (!readWrite) {
        throw new IllegalStateException("expandBuffer() only permitted for read/write buffers");
      }
      long oldSize = buffer.size;
      // 或许是原本终究一个segment也或许是新的segment,不过没关系
      Segment tail = buffer.writableSegment(minByteCount);
      // 终究一个segment能包容的巨细
      int result = Segment.SIZE - tail.limit;
      tail.limit = Segment.SIZE;
      // 实践扩容,result >= minByteCount
      buffer.size = oldSize + result;
      // Seek to the old size.
      this.segment = tail;
      // offset抵达oldSize时,现已在新的segment了
      this.offset = oldSize;
      this.data = tail.data;
      this.start = Segment.SIZE - result;
      this.end = Segment.SIZE;
      return result;
    }

这儿便是给一个minByteCount来扩容,看代码这个minByteCount应该需求比Segment.SIZE小,它会发生两种成果,一个便是原本的tail能够包容minByteCount,就不必切换segment了,假如原本的tail包容不了minByteCount,那就会进入到下一个segment,这时分原本tail后边能够写入的部分就会被越过,所以当seek到oldSize的时分会抵达新的segment起点处。

snapshot办法

  /** Returns an immutable copy of this buffer as a byte string. */
  public final ByteString snapshot() {
    if (size > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("size > Integer.MAX_VALUE: " + size);
    }
    return snapshot((int) size);
  }
  /**
   * Returns an immutable copy of the first {@code byteCount} bytes of this buffer as a byte string.
   */
  public final ByteString snapshot(int byteCount) {
    if (byteCount == 0) return ByteString.EMPTY;
    return new SegmentedByteString(this, byteCount);
  }

这儿两个snapshot办法便是回来SegmentedByteString的,SegmentedByteString和ByteString内容还许多,这儿就简略了解下功用,不做深入探讨。

ByteString注释: An immutable sequence of bytes. Byte strings compare lexicographically as a sequence of unsigned bytes. That is, the byte string ff sorts after 00. This is counter to the sort order of the corresponding bytes, where -1 sorts before 0. 不可变的字节序列。 字节字符串按字典次序比较为无符号字节序列。也便是说,字节字符串 ff 排序在 00 之后。这与相应字节的排序次序相反,其间 -1 排序在 0 之前。

上面是ByteString自己的概述,简略来说便是ByteString是用来处理不可变的字节序列的,功率比较高,Buffer是用来处理可变序列的。ByteString内部运用byte数组贮存数据,而不是用segment,供给了许多高功率的办法来对数据处理,比如加密(base64、md5、sha1等)、substring、rangeEquals、startsWith、indexOf、lastIndexOf等,感觉便是相似String吧。

SegmentedByteString注释: An immutable byte string composed of segments of byte arrays. This class exists to implement efficient snapshots of buffers. It is implemented as an array of segments, plus a directory in two halves that describes how the segments compose this byte string. 由字节数组段组成的不可变字节字符串。此类的存在是为了完成高效的缓冲区快照。它被完成为一个段数组,加上一个分为两半的目录,描述了段怎么组成这个字节字符串。

这儿只拿了SegmentedByteString前面部分的注释,大致意思便是它是一个有segment组成的不可变的字符串,它有两个数据域: segments用于保存数据,directory用来操控数据。segments是一个二维数组,directory是个一维数组,directory分为两部分,榜首部分表明segments榜首层数组中贮存字符串的累加长度,第二部分表明segments榜首层数组中贮存字符串的开端方位,不是很好了解,这儿用注释中的比如看一下就懂了:

Suppose we have a byte string, [A, B, C, D, E, F, G, H, I, J, K, L, M] that is stored across three byte arrays: [x, x, x, x, A, B, C, D, E, x, x, x], [x, F, G], and [H, I, J, K, L, M, x, x, x, x, x, x] the complete directory is [5, 7, 13, 4, 1, 0]. 前半部分表明每个array中累加的字符串长度,后半部分表明每个array中开端的方位 需求留意的是array的次序是有序的,array中的数据是接连的

SegmentedByteString其他办法就一个toByteString需求看下,它会把数据转成ByteString,再由ByteString对外供给各种功用。

selectPrefix办法

前面BufferedSource中,咱们讲到了select(Options options)办法,便是供给一个options数组,然后对每一行数据的前缀格局进行校验,然后回来校验到的options内i的index,其时那里的index便是通过Buffer来完成的。

selectPrefix办法比较长,还和Options有关,Options也比较长,这儿就不贴代码了,可是其间的原理倒是很有学习效果,一起来看下吧!

Options类

Options类比较有意思,代码很长,但就做了两件事,一个是通过of创建Options,一个便是通过buildTrieRecursive递归生成trie树。

of办法会对传入的options进行排序去重,数据暂时保存在list里边,然后调用buildTrieRecursive去生成trie树,数据终究保存在trie树里边。这儿最好先了解下trie树,它的根节点是空字符,然后从根节点能够根据前缀找到字符串,还挺实用的,用于查找很便利。

buildTrieRecursive是一个递归的办法,能够将保存在list中的字符串转成trie树,保存的时分有必定的格局,原理差不多就这样,先来看下注释:

Builds a trie encoded as an int array. Nodes in the trie are of two types: SELECT and SCAN. SELECT nodes are encoded as: – selectChoiceCount: the number of bytes to choose between (a positive int) – prefixIndex: the result index at the current position or -1 if the current position is not a result on its own – a sorted list of selectChoiceCount bytes to match against the input string – a heterogeneous list of selectChoiceCount result indexes (>= 0) or offsets (< 0) of the next node to follow. Elements in this list correspond to elements in the preceding list. Offsets are negative and must be multiplied by -1 before being used. SCAN nodes are encoded as: – scanByteCount: the number of bytes to match in sequence. This count is negative and must be multiplied by -1 before being used. – prefixIndex: the result index at the current position or -1 if the current position is not a result on its own – a list of scanByteCount bytes to match – nextStep: the result index (>= 0) or offset (< 0) of the next node to follow. Offsets are negative and must be multiplied by -1 before being used. This structure is used to improve locality and performance when selecting from a list of options. 构建一个编码为 int 数组的 trie。 trie 中的节点有两种类型:SELECT 和 SCAN。 SELECT 节点编码为: – selectChoiceCount:要挑选的字节数(正整数) – prefixIndex:其时方位的成果索引,假如其时方位本身不是成果,则为 -1 – 排序列表要与输入字符串匹配的 selectChoiceCount 字节 – selectChoiceCount 成果索引 (>= 0) 或要跟从的下一个节点的偏移量 (< 0) 的异构列表。此列表中的元素对应于前面列表中的元素。偏移量为负数,运用前有必要乘以 -1。 SCAN 节点编码为: – scanByteCount:按次序匹配的字节数。该计数为负数,运用前有必要乘以 -1。 – prefixIndex:其时方位的成果索引,假如其时方位本身不是成果,则为 -1 – 要匹配的 scanByteCount 字节列表 – nextStep:成果索引 (>= 0) 或偏移量 (< 0)要遵从的下一个节点。偏移量为负数,运用前有必要乘以 -1。此结构用于提高从选项列表中进行挑选时的局部性和功用。

原本不想看这个buildTrieRecursive办法的,可是想想如同学源码便是来学东西的,什么最能学到东西,不便是算法吗?所以仍是研讨下:

private static void buildTrieRecursive(
      long nodeOffset,                  // 在Buffer中的偏移
      Buffer node,                      // 要写入的Buffer
      int byteStringOffset,             // 在buffer中现已按次序写入字符的偏移
      List<ByteString> byteStrings,     // 对应的options列表(通过排序和去重)
      int fromIndex,                    // 上面options列表开端坐标
      int toIndex,                      // 上面options列表完毕坐标
      List<Integer> indexes)            // 上面options列表中每个方位在options中原始坐标index
  {
    if (fromIndex >= toIndex) throw new AssertionError();
    // byteStringOffset是buffer中现已按次序写入字符的偏移,必定不能超过该规模内的字符串长度还长
    // 现已写好: abc,规模内字符: abcdefg,abcg,abcf
    for (int i = fromIndex; i < toIndex; i++) {
      if (byteStrings.get(i).size() < byteStringOffset) throw new AssertionError();
    }
    ByteString from = byteStrings.get(fromIndex);
    ByteString to = byteStrings.get(toIndex - 1);
    int prefixIndex = -1;
    // 榜首个字符串长度和偏移相同长,记载下index,越过它看剩余的
    // If the first element is already matched, that's our prefix.
    if (byteStringOffset == from.size()) {
      prefixIndex = indexes.get(fromIndex);
      fromIndex++;
      from = byteStrings.get(fromIndex);
    }
    // 终究一个和榜首个在byteStringOffset偏移上不持平,即有分叉
    if (from.getByte(byteStringOffset) != to.getByte(byteStringOffset)) {
      // If we have multiple bytes to choose from, encode a SELECT node.
      int selectChoiceCount = 1;
      // 由于现已排序好了,只需按次序两两比较,不相同了就多了一种分支
      for (int i = fromIndex + 1; i < toIndex; i++) {
        if (byteStrings.get(i - 1).getByte(byteStringOffset)
            != byteStrings.get(i).getByte(byteStringOffset)) {
          selectChoiceCount++;
        }
      }
      // 计算子部分需求的长度,先往下看懂格局,intCount(node)或许是避免node中其他字符,我觉得是0,由于每次传过来的node都是新建的
      // 加2是下面两行,(selectChoiceCount * 2)表明每个节点和它对应的index
      // Compute the offset that childNodes will get when we append it to node.
      long childNodesOffset = nodeOffset + intCount(node) + 2 + (selectChoiceCount * 2);
      // 写入其时层的分支数,正数
      node.writeInt(selectChoiceCount);
      // 写入其时部分最契合的index
      node.writeInt(prefixIndex);
      // 写入榜首个option,并按次序写入分叉的option
      // 了解一下: 便是写入一层树结构,同一层的数据必定是不相同的才会分叉,前面byteStringOffset相同的进入下一层
      for (int i = fromIndex; i < toIndex; i++) {
        byte rangeByte = byteStrings.get(i).getByte(byteStringOffset);
        if (i == fromIndex || rangeByte != byteStrings.get(i - 1).getByte(byteStringOffset)) {
          // 终究8位,即一个int
          node.writeInt(rangeByte & 0xff);
        }
      }
      // 下一层
      Buffer childNodes = new Buffer();
      int rangeStart = fromIndex;
      while (rangeStart < toIndex) {
        // 遍历一下,找到和rangeStart前byteStringOffset一致的数据,即[rangeStart, rangeEnd)
        byte rangeByte = byteStrings.get(rangeStart).getByte(byteStringOffset);
        int rangeEnd = toIndex;
        for (int i = rangeStart + 1; i < toIndex; i++) {
          if (rangeByte != byteStrings.get(i).getByte(byteStringOffset)) {
            rangeEnd = i;
            break;
          }
        }
        // 左闭右开,所以rangeStart便是终究一个,这儿是递归出口
        if (rangeStart + 1 == rangeEnd
            && byteStringOffset + 1 == byteStrings.get(rangeStart).size()) {
          // The result is a single index.
          node.writeInt(indexes.get(rangeStart));
        } else {
          // 负一是格局,记载下偏移到什么当地,把上面计算到的某个节点的下一层递归
          // The result is another node.
          node.writeInt((int) (-1 * (childNodesOffset + intCount(childNodes))));
          // 留意(byteStringOffset + 1),下一层的根据是下一个字符是否相同
          buildTrieRecursive(
              childNodesOffset,
              childNodes,
              byteStringOffset + 1,
              byteStrings,
              rangeStart,
              rangeEnd,
              indexes);
        }
        // 进入该层下一个节点的下面一层(前byteStringOffset相同的是一层)
        rangeStart = rangeEnd;
      }
      // 递归完毕后,把下一层的数据写到其时层的Buffer里边
      node.write(childNodes, childNodes.size());
    } else {
      // 开端到完毕在前byteStringOffset字符都相同,先找到相同的个数
      // If all of the bytes are the same, encode a SCAN node.
      int scanByteCount = 0;
      for (int i = byteStringOffset, max = Math.min(from.size(), to.size()); i < max; i++) {
        if (from.getByte(i) == to.getByte(i)) {
          scanByteCount++;
        } else {
          break;
        }
      }
      // 不是很了解,按下面写入的格局算的(2, scanByteCount, 1(终究if中两种状况都有加一))
      // intCount(node)表明什么?已有数据的长度,每次递归不都是新建的,等于0吗?保险起见?
      // Compute the offset that childNodes will get when we append it to node.
      long childNodesOffset = nodeOffset + intCount(node) + 2 + scanByteCount + 1;
      // 用负数写入相同的个数,和上面selectByteCount区别
      node.writeInt(-scanByteCount);
      node.writeInt(prefixIndex);
      // 写入前面相同的字符
      for (int i = byteStringOffset; i < byteStringOffset + scanByteCount; i++) {
        node.writeInt(from.getByte(i) & 0xff);
      }
      // 终究一个了
      if (fromIndex + 1 == toIndex) {
        // The result is a single index.
        if (byteStringOffset + scanByteCount != byteStrings.get(fromIndex).size()) {
          throw new AssertionError();
        }
        // 终究一个option对应的index
        node.writeInt(indexes.get(fromIndex));
      } else {
        // 越过scanByteCount个数据后,后边剩余的给下一层处理
        // The result is another node.
        Buffer childNodes = new Buffer();
        node.writeInt((int) (-1 * (childNodesOffset + intCount(childNodes))));
        buildTrieRecursive(
            childNodesOffset,
            childNodes,
            byteStringOffset + scanByteCount,
            byteStrings,
            fromIndex,
            toIndex,
            indexes);
        node.write(childNodes, childNodes.size());
      }
    }
  }

加了许多注释仍是不太好了解,大致来看便是按层遍历一个树,榜首层便是对榜首个字符的比较,以此类推,假定是第i层,就要对这层的数据比较第i位,每层分两种状况:这一层第i位都相同(乃至接连N层相同),这层第i位不同(有j种不同,那该节点下一层就有j个节点)。总而言之,便是该节点后边不相同了就递归,终究把数据写入到总的一个buffer中。

大致便是这样,算法这东西说清楚有点难,仍是代码好了解,上面的写入格局我也不太明白,不过没关系持续看的去,慢慢就懂了。

selectPrefix原理

看完了Options类,再回过来看selectPrefix办法,总算就不是一头雾水了,现在一切options数据都保存在了trie树里边,只需对options树进行遍历,按格局把数据读出来就行了。

int selectPrefix(Options options, boolean selectTruncated) {
    Segment head = this.head;
    if (head == null) {
      if (selectTruncated) return -2; // A result is present but truncated.
      return options.indexOf(ByteString.EMPTY);
    }
    Segment s = head;
    byte[] data = head.data;
    int pos = head.pos;
    int limit = head.limit;
    int[] trie = options.trie;
    int triePos = 0;
    int prefixIndex = -1;
    navigateTrie:
    while (true) {
      // Options中两种状况,n个字符接连相同、或许直接子树个数
      int scanOrSelect = trie[triePos++];
      // -1是初始化的默许值
      int possiblePrefixIndex = trie[triePos++];
      if (possiblePrefixIndex != -1) {
        prefixIndex = possiblePrefixIndex;
      }
      int nextStep;
      if (s == null) {
        break;
      } else if (scanOrSelect < 0) {
        // Scan形式,要先读取接连的一段字符串
        // Scan: take multiple bytes from the buffer and the trie, looking for any mismatch.
        int scanByteCount = -1 * scanOrSelect;
        int trieLimit = triePos + scanByteCount;
        // 对前面接连的scanByteCount个方位进行比较
        while (true) {
          // 取segment中数据进行比较
          int b = data[pos++] & 0xff;
          // 在scanByteCount中不相同了,就直接回来prefixIndex
          if (b != trie[triePos++]) return prefixIndex; // Fail 'cause we found a mismatch.
          boolean scanComplete = (triePos == trieLimit);
          // 切换segment
          // Advance to the next buffer segment if this one is exhausted.
          if (pos == limit) {
            s = s.next;
            pos = s.pos;
            data = s.data;
            limit = s.limit;
            // segment数据读取完了
            if (s == head) {
              if (!scanComplete) break navigateTrie; // We were exhausted before the scan completed.
              s = null; // We were exhausted at the end of the scan.
            }
          }
          // 在scanByteCount中全比对上了,就退出Scan这个代码块,进入外层下一个循环
          if (scanComplete) {
            // nextStep是下一个节点数据,而且必定不是scan形式了,可是记载的是负的偏移
            // Options中: node.writeInt((int) (-1 * (childNodesOffset + intCount(childNodes))));
            nextStep = trie[triePos];
            break;
          }
        }
      } else {
        // Select形式,selectChoiceCount是分支树,后边的数据便是这一层的数据共selectChoiceCount个
        // Select: take one byte from the buffer and find a match in the trie.
        int selectChoiceCount = scanOrSelect;
        // 遍历这一层
        int b = data[pos++] & 0xff;
        int selectLimit = triePos + selectChoiceCount;
        while (true) {
          // 这一层都匹配失利了,那就失利了啊,没有必要持续了
          if (triePos == selectLimit) return prefixIndex; // Fail 'cause we didn't find a match.
          // 匹配到这一个分支,break进入下一层
          if (b == trie[triePos]) {
            // nextStep是下一层循环的偏移,是一个负数
            // Options中: node.writeInt((int) (-1 * (childNodesOffset + intCount(childNodes))));
            nextStep = trie[triePos + selectChoiceCount];
            break;
          }
          triePos++;
        }
        // Advance to the next buffer segment if this one is exhausted.
        if (pos == limit) {
          s = s.next;
          pos = s.pos;
          data = s.data;
          limit = s.limit;
          if (s == head) {
            s = null; // No more segments! The next trie node will be our last.
          }
        }
      }
      // 失利状况现已return,nextStep上面现已提到了是负数,进入下一层循环,大于等于0的时分便是完毕的时分,存的index
      // Select时: node.writeInt(indexes.get(rangeStart));
      // Scan时: node.writeInt(indexes.get(fromIndex));
      if (nextStep >= 0) return nextStep; // Found a matching option.
      // 上面注释有,-nextStep便是下一个偏移所在方位
      triePos = -nextStep; // Found another node to continue the search.
    }
    // We break out of the loop above when we've exhausted the buffer without exhausting the trie.
    if (selectTruncated) return -2; // The buffer is a prefix of at least one option.
    return prefixIndex; // Return any matches we encountered while searching for a deeper match.
  }

麻了麻了,总算算了解清楚了。两种形式scan和select,格局如下:

  • scan榜首位是该层分支个数(正数),第二位是默许的prefixIndex(-1,树的终究一个便是正数index),然后接该层的数据,再接上这一层的数据,终究一位两种状况:下一层的偏移、每终究一个节点的index。
  • select榜首位是要越过字符的个数(负数用于区别),第二位和上面相同,然后是要越过的字符,没有数据,终究一位两种状况(下一个方位必定是select形式): 下一层的偏移、每终究一个节点的index。

然后便是这两种形式的混合,构成了整个trie树。

writableSegment办法

writableSegment办法比起上面几个内容就简略多了,直接双向链表拿终究的segment,看看容量够不行,不行从SegmentPool拿个新的放终究并回来。

/**
   * Returns a tail segment that we can write at least {@code minimumCapacity}
   * bytes to, creating it if necessary.
   */
  Segment writableSegment(int minimumCapacity) {
    if (minimumCapacity < 1 || minimumCapacity > Segment.SIZE) throw new IllegalArgumentException();
    if (head == null) {
      head = SegmentPool.take(); // Acquire a first segment.
      return head.next = head.prev = head;
    }
    Segment tail = head.prev;
    if (tail.limit + minimumCapacity > Segment.SIZE || !tail.owner) {
      tail = tail.push(SegmentPool.take()); // Append a new empty segment to fill up.
    }
    return tail;
  }