EEOD - Explicit End of Data for TCP
Over the past few years, the web has witnessed an explosion of
dynamic content generation to provide web users with an interactive
and personalized experience. While traditional web caching
techniques work well when redundancy occurs on an object-level basis
(page, image, etc.), the use of dynamic content presents unique
challenges. Although past work has addressed mechanisms for
detecting redundancy despite dynamic content, the scalability of
such techniques is limited. In this paper, we present a technique for
explicit packet boundary delineation to
enable scalable and highly efficient packet caching in the network.
Our approach, Explicit End of Data (EEOD), does not require client-side
modification and requires only minimal server-side modification.
Through experimental studies on an Apache web server, we demonstrate
improvements in bandwidth efficiency and retrieval time over current approaches in the literature.
Overview
Significant work has been invested by the research
community in improving the efficiency and hence scalability of
the web \cite{Squid97,CacheRef2,LoadBalanceSurveys}.
However, the emergence of dynamic content (blogs with commentary,
user preferences, etc.) creates significant difficulties for traditional
caching mechanisms.
To that end, various mechanisms
\cite{HTTP:protocol:1.1,deltaencoding,BaseCaching,TemplateCaching,ajax}
have been proposed to explicitly separate dynamic content into
cache-friendly objects. As a result, dynamic content can still be
cached in the existing object-based cache infrastructure.
Unfortunately, such mechanisms often involve significant and
non-trivial site re-design.
In contrast, recent work \cite{PacketCache,PartialPacketCache,VBWC}
adapts the caching mechanism to detect redundancy in dynamic
content without site modifications. Whole packet caching
\cite{PacketCache}, while lightweight computationally, performs
poorly when cacheable content is not aligned on a packet basis.
Although the techniques in \cite{PartialPacketCache,VBWC} do not
require packet-wise alignment, they pay a significant
computational price to dynamically infer cacheable content
boundaries.
In essence, existing approaches to handling dynamic content
efficiently fall into two categories: high accuracy at the cost of
heavyweight site modifications, or complete avoidance of site
modifications at significant in-band computational expense. It is
the premise of this paper that a middle ground can be reached by
providing a lightweight mechanism for accurate boundary demarcation
of cacheable content.
In this paper, we introduce the concept of an Explicit End of Data (EEOD) marker.
With minimal effort, the content provider can force packet separation of cacheable
and non-cacheable content to enable highly scalable whole packet caching.
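
To make the alignment problem concrete, the following is a minimal sketch, not the implementation evaluated in this paper. It assumes a fixed segment size of 1460 bytes, a toy whole packet cache keyed on a SHA-1 digest of each payload, and a hypothetical \texttt{packetize} helper whose \texttt{eeod} flag stands in for the effect of an EEOD marker (a forced packet boundary at the end of each application write). Real TCP segmentation depends on further factors that are ignored here.

```python
import hashlib

MSS = 1460  # assumed segment payload size in bytes (illustrative only)


def packetize(chunks, mss=MSS, eeod=False):
    """Split application writes into packet payloads.

    eeod=False: the writes are treated as one byte stream cut every `mss` bytes.
    eeod=True:  a packet boundary is forced at the end of every write, which is
                the assumed effect of an EEOD marker in this sketch.
    """
    if not eeod:
        chunks = [b"".join(chunks)]
    packets = []
    for chunk in chunks:
        packets.extend(chunk[i:i + mss] for i in range(0, len(chunk), mss))
    return packets


def cache_hits(first, second):
    """Count packets of `second` whose payload already appeared in `first`."""
    seen = {hashlib.sha1(p).digest() for p in first}
    return sum(hashlib.sha1(p).digest() in seen for p in second)


# Deterministic pseudo-random "static" body spanning exactly four packets.
static = b"".join(
    hashlib.sha256(str(i).encode()).digest() for i in range(183)
)[:4 * MSS]

# Two responses that differ only in a short dynamic prefix of varying length.
first_write = [b"user=alice;", static]
second_write = [b"user=bob-smith;", static]

hits_plain = cache_hits(packetize(first_write), packetize(second_write))
hits_eeod = cache_hits(
    packetize(first_write, eeod=True), packetize(second_write, eeod=True)
)
print("whole packet hits without forced boundary:", hits_plain)
print("whole packet hits with forced boundary:", hits_eeod)
```

In this toy setting, the four-byte difference in the dynamic prefix shifts every subsequent packet, so none of the second response's packets match the cache. Once a boundary is forced after the dynamic prefix, all four static packets match.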
Specifically, the contributions of our paper include:
- \emph{EEOD concept:} The paper proposes the notion of an Explicit End of Data (EEOD) marker to facilitate the efficient and accurate packet-level separation of cacheable and non-cacheable content.
- \emph{Improved whole packet caching:} The paper introduces an improved whole packet caching model that uses hints from EEOD combined with a novel windowed aggregation scheme.
- \emph{Prototype evaluation:} The paper presents extensive experimental studies contrasting EEOD with existing schemes. Notably, the paper demonstrates a 25\% relative improvement in bandwidth savings in addition to significantly improved scaling properties in terms of retrieval time.
- \emph{Rabin fingerprinting efficiency:} This paper is the first to highlight the poor performance of Rabin fingerprinting when minimal cacheable content exists.