University of Notre Dame NetScale Laboratory

Position Paper - Centralization and Scalability Can Co-Exist

Preface

This work was originally submitted to the HotNets VI workshop in 2007. Unfortunately, the paper was not accepted but you may find the reviews along with a detailed response for your perusal. As the paper was primarily a philosophical paper, we have wikified the document here in its original form. A PDF of the document is also attached to this wiki topic. If you would like to cite this work, please contact Dr. Striegel for the appropriate citation.

Front Matter

Title: Reducing CDNs to ABC: Scalability and Centralization Can Play Nice Together

Authors: Aaron Striegel, Dave Salyers, Yingxin Jiang, Andrew Blaich

Institution: University of Notre Dame

Support Footnote: This work was supported in part by the National Science Foundation through the grant CNS03-47392.

Abstract

The centralized approach to content distribution has long been the exclusive purview of resource-rich content providers (Google, eBay, etc.) or the mark of unpopular content. Despite its allure with regard to content customization or safeguarding sensitive user data, content providers typically employ content distribution networks (CDNs) to meet their scaling and performance needs. The mutual exclusion of centralization and scalability is nearly unquestioned dogma, i.e. distributed schemes are nearly always better. In this paper, we aim to show that centralization and scalability can indeed play nice together without application-specific proxying. We describe an approach, ScaleBox, that operates transparently to the client and network core. Furthermore, we argue that scalable, centralized distribution is indeed possible in the current Internet environment through a seamless interweaving of new and existing application-agnostic bandwidth techniques.

Introduction

In the area of content distribution, the Content Distribution Network (CDN) has emerged as an early, clear favorite. The wide scale geographic distribution of content à la Akamai offers essential capabilities for popular content, namely scalability that in turn delivers improved performance by load balancing user requests to nearby content caches. The recent explosion of P2P mechanisms (BitTorrent, etc.) and research \cite{CoopNet03,splitStream} echo this sentiment, namely that distributing the resources is nearly always a good thing. However, the wide scale distribution of content directly conflicts with the drive towards the customization of content (Web 2.0) and the safeguarding of sensitive user data (credit card information, etc.). In short, the private user goodness (data) is what immerses the user and yet it is the least amenable to being distributed.

The ability to offer centralized content distribution remains a tantalizing but elusive option for many sites. In the absence of massive resources (e.g. YouTube), web sites are consigned to the CDN route: Akamai-ze the site or choke under the requests. Hence, content sites have become good at playing the CDN game, offering the illusion of truly dynamic content through a mixture of customized and static content subject to the rendering constraints of the content viewer (web browser, media player). Conversely, other web sites simply ignore the trend towards dynamic content and wager that an explosion of popularity is not a highly probable event.

While sites such as YouTube and Slashdot are largely forced to embrace centralized distribution, those sites exemplify why there are clouds on the horizon for the traditional CDN. Whereas CDNs tend to function best when popular content has a limited scale, the sheer breadth of content on YouTube necessitated its embrace of centralization over the increased complexity and limited benefit of a CDN. Alternatively, highly dynamic content such as in Slashdot or other forum/feedback-driven websites (ArsTechnica, DailyKos, etc.) stress CDNs in that the core content being served is constantly changing. Furthermore, massive on-line games (World of Warcraft) exhibit similar properties as well but with centralization being significantly driven by security/quality control.

Is scaling dynamic or vast repositories a function of over-provisioning a bank account? We posit in this paper that the answer is no. While the instinctive reactions toward centralization and scaling are usually NO followed by BAD IDEA\footnote{To be fair, this has been borne out countless times in practice.}, it is our premise that it is possible to achieve consistent performance across a variety of scales while still employing centralized distribution and without resorting to YouTube-like resource escalation. Moreover, we make the scurrilous postulation that the CDN can be reduced to ABC (application-agnostic bandwidth conservation) and that furthermore, performance from the client perspective can be improved as well within the current Internet environment.

In this paper, we outline ScaleBox, a culmination of recent bandwidth conservation techniques \cite{EEOD:HotWeb06,CompNet07:PALM} coupled with novel techniques for session acceleration and latent synchronization discovery. It is the complete perspective of ScaleBox that enables the following broad claims as topics for further discussion:

  • Scaling and performance do not preclude centralization for content distribution, even without vast resources. Scaling and performance are intricately linked properties, especially with smaller content flows, as poor performance (increased delay for a longer RTT and/or packet drops) rapidly decays to zero performance. In contrast, if one can achieve reductions of an order of magnitude and beyond for popular content while accelerating content retrieval, centralization becomes significantly more appealing.
  • The approach is feasible in the near term without radical network re-design. Although radical changes to the core of the network are interesting to hypothesize, the techniques outlined in the paper do not require any changes to the core of the network nor to the end clients themselves. The changes are limited to the server (local server LAN, server application modifications are optional) and the final ISP (client-side) at the edge of their network (in-band devices acting on easily identified traffic). Moreover, while we admit that our scheme does need the involvement of the client-side ISP for benefit (deployment is optional), the economic incentives are exceptionally strong with only the two parties who benefit most being required to participate (content provider, client-side ISP).
  • Normal TCP rules should not apply when bandwidth conservation is involved. While bandwidth conservation allows a flow or session to be an extremely good netizen, there is no implicit reward beyond marginal increases in performance in most schemes. We discuss whether efficient transfers should be rewarded to match or even exceed nominal TCP fairness across constraining links.

ScaleBox Approach

Figure: ScaleBox

As noted in the introduction, ScaleBox adds a group of in-band devices at the content provider and, if possible, devices at the downstream local ISPs. Figure \ref{FigScaleBox} shows an overview of the ScaleBox architecture. For the purposes of simplicity, we assume that appropriate discovery mechanisms have taken place regarding the presence or lack thereof of ScaleBox devices at the local ISPs.

The goals of ScaleBox are fairly straightforward: (1) remove redundant / repeating content wherever possible; (2) remain transparent to the client and agnostic to the application; (3) accelerate / enhance connection transfer times through natural and artificial methods while preserving (2); (4) avoid any modifications to the network core. We cannot stress enough that all transformations (tokenization via packet caching, multicast aggregation via stealth multicast) are entirely hidden from the end client (i.e. end clients do not change) and largely hidden from the server application (server may choose to optionally enhance data amenability).

Packets are generated on the server through typical application serving, be those connections TCP or UDP. Upon leaving the server, the packets are directed as appropriate through the ScaleBox gauntlet of transformations (denoted by a dashed line in the figure).

Packet Caching: First, packets are examined for their caching potential via an in-band packet cache (PC) \cite{EEOD:HotWeb06}. A packet that resolves to a hit in the cache table (present at the upstream and downstream packet caches) has its content replaced with a minimally sized token. The replacement with a token offers a potential $N:T$ reduction where $N$ is the original packet payload size and $T$ is the token size. The tokenized packet travels across the core of the Internet where it is reconstructed at the downstream child packet caches located at the local ISPs for the end clients. Ideally, data would be separated on a packet-wise basis via techniques such as the TCP Explicit End of Data (EEOD) \cite{EEOD:HotWeb06} (minor server application modification/kernel modification) to enable the use of whole packet caching \cite{wetherall:duppkt:USENIX} rather than the poorly scaling partial packet caching \cite{wetherall:dupRsig:SIGCOMM}.
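The token replacement described above can be illustrated with a minimal sketch. It assumes whole-packet caching keyed by a truncated SHA-256 digest and a 16-byte token; the token size, the hash choice, and the names `PacketCache` and `send_through` are illustrative assumptions, not details taken from the paper.

```python
import hashlib

TOKEN_SIZE = 16  # assumed token length T in bytes (not specified in the paper)


class PacketCache:
    """Toy whole-packet cache; one instance upstream, one downstream."""

    def __init__(self):
        self.table = {}  # token -> original payload

    @staticmethod
    def token_for(payload: bytes) -> bytes:
        # Hypothetical tokenization: truncated digest of the payload.
        return hashlib.sha256(payload).digest()[:TOKEN_SIZE]

    def encode(self, payload: bytes):
        """Upstream side: emit a token on a hit, the full payload on a miss."""
        token = self.token_for(payload)
        if token in self.table:
            return ("token", token)
        self.table[token] = payload
        return ("data", payload)

    def decode(self, kind: str, body: bytes) -> bytes:
        """Downstream side: reconstruct the original payload."""
        if kind == "token":
            return self.table[body]
        self.table[self.token_for(body)] = body
        return body


def send_through(upstream: PacketCache, downstream: PacketCache, payload: bytes):
    """Return (bytes crossing the core, payload delivered to the client)."""
    kind, body = upstream.encode(payload)
    return len(body), downstream.decode(kind, body)
```

In this sketch the first copy of a payload crosses the core at full size $N$, while any repeat crosses as a $T$-byte token and is rebuilt downstream, which is the $N:T$ reduction the text refers to. Keeping the two cache tables consistent is simply assumed here.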

Stealth Multicast: Upon exiting the packet cache, packets enter the stealth multicast (SM) module \cite{CompNet07:PALM}. While packet caching focuses on reducing long term redundancy through tokenization, stealth multicast focuses on reducing short term redundancy through coalescing of packets into virtual multicast groups. If an application exhibits repeated bursts of redundancy in short time periods, packets from that application are temporarily enqueued for a short time period (less than 5 ms) to allow for condensing of duplicate packet payloads into a single multicast transmission. In short, a minor amount of artificial delay, imposed where jitter and traffic mixing are not likely to be an issue (close to the source), can enable significant bandwidth savings. Depending on the relative speeds of the LAN and the desired coalescing time, a reduction of $N:1$ is possible where $N$ is the number of receivers of the content. Critically, the stealth in stealth multicast comes from the fact that unicast packets are coalesced into multicast and back without changes to the server or client itself. The multicast transport mechanism itself may be any appropriate multicast mechanism (ALM \cite{CompNet07:PALM}, AMT (Automatic Multicast Tunneling), SGM \cite{SGM}, etc.). In particular, our work in \cite{CompNet07:PALM} proposes an ALM-centric version of stealth multicast (PALM - Passive Application Layer Multicast) whereby a node at the client-side ISP would serve as an ALM Helper Device (AHD) to send/replicate data and convert PALM data packets to the original data packets.
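The short-term coalescing step can be sketched as follows. This is a simplified model that assumes packets are matched on identical payload bytes and held for a fixed window; the function name `coalesce` and the tuple layout are hypothetical, and the actual multicast transport (ALM, AMT, SGM) is left out.

```python
COALESCE_WINDOW_S = 0.005  # the "less than 5 ms" holding time from the text


def coalesce(packets, window=COALESCE_WINDOW_S):
    """Group unicast packets with identical payloads queued within `window`.

    `packets` is an iterable of (arrival_time_s, receiver, payload) tuples,
    sorted by arrival time. Returns a list of (payload, [receivers]) entries,
    one per transmission that would leave the stealth multicast module.
    """
    open_groups = {}  # payload -> (first_arrival_time, [receivers])
    sent = []
    for t, receiver, payload in packets:
        # Flush every group whose holding time has expired.
        expired = [p for p, (t0, _) in open_groups.items() if t - t0 >= window]
        for p in expired:
            sent.append((p, open_groups.pop(p)[1]))
        if payload in open_groups:
            open_groups[payload][1].append(receiver)
        else:
            open_groups[payload] = (t, [receiver])
    # Flush whatever is still queued at the end of the trace.
    for p, (_, receivers) in open_groups.items():
        sent.append((p, receivers))
    return sent
```

Three requests for the same payload arriving within the window leave as one transmission with three receivers (the $N:1$ case), while a fourth request arriving later is sent on its own.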

Packets vs. Bytes: TCP Fairness

Figure: TCP Pre-fetching

By themselves, packet caching and stealth multicast offer significant improvements to scalability but only marginal improvements to performance. Despite the fact that the underlying connections are being extremely `good' netizens ($N:T$ byte-wise reduction or better), the normal congestion rules of TCP dominate performance. Each tokenized packet takes nearly the same time to send as the non-tokenized case (ignoring scaling) and as a result, performance is largely determined by RTT. Contrast this with a typical CDN employing local load balancing that nominally optimizes RTT by using a nearby content repository, which in turn creates a noticeable performance gain for sessions in the aggregate.

In a byte-wise sense, the ScaleBox enhanced flow is performing sub-optimally when the constraining link exists inside the network and not in the last mile to the client. Hence, the final server-side component of ScaleBox is the ability to proxy ACK and/or pre-fetch data for the TCP connection, i.e. the pre-fetch device (PF). If a packet cache hit has historically exhibited subsequent packet cache hits, the next piece of data is pre-fetched from the server. Put simply, if one observes a hit, it is most likely that the next few pieces of data will also exhibit a hit (e.g. picture, HTML spanning multiple packets, video/audio frame).
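One way to picture the "historically exhibited subsequent hits" rule is a small per-signature counter. The sketch below is only an illustration: the class name `PrefetchPredictor`, the thresholds `min_ratio` and `min_samples`, and the notion of a string signature are assumptions, and the paper does not specify how the history is kept.

```python
class PrefetchPredictor:
    """Decide how many segments to pre-fetch after a packet cache hit."""

    def __init__(self, pf_max=5, min_ratio=0.8, min_samples=4):
        self.pf_max = pf_max            # bound on the pre-fetch size
        self.min_ratio = min_ratio      # assumed confidence threshold
        self.min_samples = min_samples  # assumed minimum history length
        self.stats = {}                 # signature -> (hits seen, hits followed by a hit)

    def record(self, signature, next_was_hit):
        """Log whether the packet following a hit on `signature` also hit."""
        seen, followed = self.stats.get(signature, (0, 0))
        self.stats[signature] = (seen + 1, followed + (1 if next_was_hit else 0))

    def prefetch_count(self, signature):
        """Return pf_max when history supports pre-fetching, otherwise 0."""
        seen, followed = self.stats.get(signature, (0, 0))
        if seen < self.min_samples:
            return 0
        return self.pf_max if followed / seen >= self.min_ratio else 0
```

A signature with too little history, or one whose hits are rarely followed by further hits (e.g. dynamic content), falls back to normal TCP behavior in this sketch.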

Figure \ref{FigPrefetch} demonstrates the performance improvements offered by pre-fetching. Rather than TCP CWND growth being dominated by end-to-end RTT, TCP CWND expansion can virtually increase through pre-fetching according to local RTT (LAN at the provider side). Note that in the ideal case, the CWND for the connection is only virtually grown (i.e. additional transmissions allowed) rather than following the standard slow start growth curve. We assume an acknowledgement of the data is required for all nominal TCP adaptations. It is the direct responsibility of the pre-fetch device to correctly adjust its pre-fetching to keep within TCP fairness constraints. Although it is possible to proxy acknowledgements, similar to the in-network device proposed by the startup company marketing FAST TCP \cite{ToN:FASTTCP}, proxy acknowledgements are problematic due to increased complexity and buffering at the PF device. Moreover, proxy acknowledgements on any adaptive schemes beyond TCP New Reno could incur dramatic consequences.

The packet cache and stealth multicast devices are unaffected operation-wise by the PF device beyond accelerated connection transfers due to the lower RTT. Provided that the critical links in the network exist between the source and client-side ISP, the transfer is still TCP fair in a byte-wise sense. The original $N:T$ byte-wise reduction and limited scale of pre-fetching impose at worst a $PF_{Max} \cdot T$ payload where $PF_{Max}$ is the bound on the pre-fetch size and the token size $T$ is likely extremely small. The $PF_{Max}$ setting is also critical for the downstream client as it dictates the burst that an individual client would receive from the original packets. Optionally, the downstream child packet cache could shape outbound detokenized packets with a reasonable $PF_{Max}$. A miss on the packet cache due to pre-fetching will be held until sufficient acknowledgements enable releasing at the server via normal transmissions. Timers for capturing the loss of pre-fetched packets are set using the normal TCP timer mechanisms.
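As a rough illustration of the byte-wise bound, take assumed values that do not come from the paper (a payload of $N = 1460$ bytes and a token of $T = 16$ bytes) together with the conservative $PF_{Max} = 5$ used in the performance estimate:

```latex
\[
  PF_{Max} \cdot T = 5 \cdot 16 = 80 \text{ bytes} \;\ll\; N = 1460 \text{ bytes}
\]
```

Under these assumptions a full pre-fetch burst of tokens still occupies fewer bytes on the constraining link than a single untokenized packet, and this remains the case as long as $PF_{Max} \le N/T$.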

Figure: TCP Pre-fetching Performance

Figure \ref{GraphFetchPerf} notes the estimated performance of a conservative pre-fetching scheme $PF_{Max}=5$ versus normal TCP New Reno performance for 16k and 32k object retrieval ignoring the initial 3-way handshake, HTTP request, and connection closure. The graph captures the raw transfer time of the object itself which due to its small size will be dominated by the slow start mechanism. In particular, we note that even the relatively conservative pre-fetch of 5 achieves nearly a 2x improvement over the normal TCP New Reno counterparts with similar growth curves. A more aggressive pre-fetch could level out the 16k and 32k curves by capturing the entirety of the object in a single collection of tokens. Intelligent usage of the historical performance of contiguous matches (tied to a specific payload signature/application) could reduce many `mice' transfers to a single packet. When tied together with HTTP 1.1 pipelining, pre-fetching could offer even further performance improvements.

The ability of a device to request pre-fetching and to introduce a `violation' of TCP fairness brings an interesting discussion point beyond ScaleBox itself, namely \emph{should efficient, aka good netizens receive improved performance?} While this discussion has roots in multicast incentive \cite{sharedCostMCastToN}, the notion of accelerated TCP for packet caching brings an interesting twist to the discussion. In short, one can stay TCP fair in a byte-wise and even a packet-wise sense across what are likely to be the constraining links. Is the accelerated fetching during the slow start phase enough or should there be further incentive during congestion avoidance over and above the nominal steady state rate? Does the fact that one is a good netizen mean that the constraints on links outside the purview of ScaleBox, i.e. last mile, can be ignored? For instance, could one transmit a full MSS of tokens with each individual token in turn representing a full MSS? Should the pre-fetching limit ($PF_{Max}$) be tied directly to the current CWND? Conversely, what happens when a good netizen goes bad, i.e. many hits and then consecutive unique (non-cacheable) content?

Although appealing from an adoption standpoint to encourage good behavior for bandwidth conservation by violating TCP fairness, the cautious approach is likely best to avoid unforeseen consequences. In the interim, we believe that acceleration is most beneficial during the slow start phase and hence should be relegated to being either applied only during the slow start phase of a connection (easily assessed by connection start time and number of packets) or that acknowledgements of pre-fetched segments in congestion avoidance should be ignored by more aggressive algorithms (TCP Vegas, TCP Westwood, etc.). Congestion avoidance algorithms under TCP New Reno are only nominally affected as bounds on $PF_{Max}$ (the maximum number of segments to pre-fetch) prevent rapid window growth under many if not most steady state conditions.

Slashdot Is My Friend

Figure: Tail synchronization

While packet caching offers an impressive $N:T$ reduction for TCP or UDP transfers, the $N:1$ reduction of stealth multicast is relegated primarily to UDP transfers or RTSP over TCP transfers (i.e. streaming media). In large part, TCP connections fail to coalesce in a short time frame, partially due to differing arrival times and partially due to per-connection variations (RTT, data alignment, etc.). Coalescing of content requests (i.e. batch serving) can offer some synchronization with popular content but offers little help for per-connection variance. We note an interesting observation in that flash crowds (i.e. the Slashdot effect) are actually a good thing in that increased arrival requests offer improved chances for synchronization. If reasonable serving capacity exists at the provider, stealth multicast performs best in the face of a flash crowd as the $N:1$ reduction is possible even with TCP for short-lived connections. While this observation is certainly not new in light of past efforts to shift popular content to push vs. pull \cite{ammar98:Pushy}, stealth multicast is able to deliver the push benefit without adapting the application to a push/group-centric approach.

The flash crowd represents a case of exceptionally popular content but an increasingly occurring case is one closer to that of YouTube: rich multimedia content with tremendous breadth offered via Flash and TCP. The implicit synchronization offered by the flash crowd will likely not exist due in part to the length of the content and the breadth of offerings. However, the content itself is short enough that on demand does not translate to waiting for background downloading via P2P (DVDs à la BitTorrent). In some sense, users have become spoiled by the speedy nature of YouTube, acting both impatiently (limited tolerance to wait to play, limited tolerance for playback buffer exhaustion) and selfishly (better to free ride than serve).

Hence, we posit the notion of tail synchronization for such \emph{medium}-sized content, i.e. synchronization should occur from the tail, furthest from the playback buffer, to maximize potential and yet mask the typical churn and security issues associated with dynamic streaming. While we do not preclude longer content, the dynamics of users that watch the entirety of the longer content \cite{ProfitVoD07} may necessitate streaming from alternative synchronization points (middle, etc.). Figure \ref{FigTailSync} captures the tail synchronization approach. Upon a request for a medium-sized object (e.g. Flash movie), a timer is set to allow for coalescing of requests for the same content. If additional requests arrive before the first client crosses the coalescing barrier ($CB$), the nodes are added to a virtual group. If sufficient virtual group membership exists when any client in the group crosses the $CB$, synchronized transfer begins in a separate group-oriented transfer from the tail of the stream. The stealth multicast module will then detect such synchronization and appropriately exploit said synchronization. The forward TCP transfers that populate the playback buffer continue as normal, with resolution favoring the forward TCP transfer near the middle to end of the object. The net result is the discovery of latent synchronization where it exists and normal performance when it does not, with any client-side support embedded directly in the server-distributed Flash control.
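The group-formation rule can be sketched in a few lines. This is a hedged illustration only: the class `TailSyncGroup`, the `min_members` threshold standing in for "sufficient virtual group membership", and the expression of $CB$ as a fraction of the object size are assumptions made for the example.

```python
class TailSyncGroup:
    """Virtual group for one medium-sized object (hypothetical names throughout)."""

    def __init__(self, object_size, coalescing_factor=0.33, min_members=2):
        self.cb = coalescing_factor * object_size  # coalescing barrier CB, in bytes
        self.min_members = min_members             # assumed "sufficient membership"
        self.progress = {}        # client -> bytes delivered by its forward TCP transfer
        self.closed = False       # True once some client has crossed CB
        self.synchronized = False # True if a tail transfer was started

    def join(self, client):
        """Add a requester; returns False once the barrier has been crossed."""
        if self.closed:
            return False
        self.progress[client] = 0
        return True

    def on_progress(self, client, bytes_delivered):
        """Update forward progress; returns True if a tail transfer should start."""
        self.progress[client] = bytes_delivered
        if not self.closed and bytes_delivered >= self.cb:
            self.closed = True
            self.synchronized = len(self.progress) >= self.min_members
            return self.synchronized
        return False
```

When the first client crosses the barrier with enough members present, the sketch signals that a group-oriented transfer from the tail may begin; with a lone client it signals nothing and the forward TCP transfer simply continues as normal.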

Figure: Coalesce Time

Figure \ref{GraphCoalesce} notes the time available to coalesce connections for tail synchronization as the steady state TCP rate is varied for two recent popular YouTube video sizes, 8 MB and 16 MB. While the 10 second short clips do not approach those magnitudes, the typical video length and hence size on the most popular video list is significantly longer. Consider a 16 MB clip based on the characteristics of 'The 305', a popular parody of the movie 300. For high-speed DSL and university access, the download speeds varied between 0.9 Mb/s and 1.3 Mb/s depending on the access time, resulting in download times on the order of 1:15 to 2:18. Taking the video habits from \cite{ProfitVoD07} based on MSN Video, the video would have a strong chance in most cases of being fully downloaded before the user might consider terminating playback. In those cases, the data would still be transferred in its entirety (YouTube Flash/TCP transfers) regardless of viewing. Moreover, a coalescing factor of 0.33 (wait to synchronize until one of the clients passes 33\% of the download) would allow for coalescing times on the order of 23 to 27 seconds, offering a likely benefit to any content 16 MB or greater that is accessed more than 375 times per day (assuming a uniformly distributed access pattern for simplicity). Even a coalescing factor of 0.20 would likely yield benefits with 16 MB or greater clips accessed more than 617 times per day.

Deployment / Economic Incentive

In addition to the dual benefits of improved scaling ($N:T,N:1$ potential bandwidth reductions) and improved performance (TCP acceleration via pre-fetching, tail synchronization), ScaleBox enjoys a relatively simple path to deployment and what we would argue is a compelling economic case for the interested parties. To start, ScaleBox embraces transparency whenever possible in its deployment coupled with the notion of discovery for downstream resources. ScaleBox will not reduce performance (functionality is enabled by local ISP deployment) and may be deployed in an incremental manner.

Critically, we posit that ScaleBox gets the economic incentive for bandwidth conservation right, namely incentive is largely independent of the core of the network. Contrast this approach with IP multicast which requires inter-domain niceties (although less so with Automatic Multicast Tunneling (AMT)), possesses difficult cost/benefit modeling (ISP and content provider), and incurs a chicken vs. egg dilemma for deployment (demand vs. development). ScaleBox offers economic incentive to the parties most likely to be subject to resource constraints, namely the content provider and the local ISP. The content provider would like to reduce bandwidth consumption and improve performance. In a similar vein, the local ISP would also like to reduce bandwidth consumption and improve performance. Through mutual agreements or even via openly competitive resources, both can achieve their goal with ScaleBox without the direct embrace of new technologies by the core of the network (i.e. the traffic is simply IP just as before). Furthermore, we also note that ScaleBox is largely protocol agnostic in that the conservation is application agnostic.

Further Discussions

Isn't this merely a camouflaged CDN? At first glance, it would appear that the notions of the child packet cache and stealth multicast helper nodes are merely a semantic twist to the CDN concept. We note two key differences: the application-agnostic nature of ScaleBox and the potential for open resource negotiation. First, ScaleBox is not tied to a specific application, be that application or protocol HTTP, FTP, Flash, BitTorrent, or countless others. The agnostic nature of ScaleBox enables the second, namely the potential for open resource negotiation. Rather than tying replicated content to explicit agreements, ScaleBox offers the potential for fluid negotiation of resources to handle newly emerging network dynamics. The good news is that the network becomes more agile but the bad news is the implications for security (easier DoS attacks, new DoS avenues) which must be clearly addressed in the future.

Panacea or curse? While in an ideal world, adoption of ScaleBox would rapidly emerge from local ISPs due to its limited footprint, there is still a notable transition period while devices would be placed. Hence, it is unlikely that ScaleBox would replace CDNs but rather we would hope that CDNs would offer ScaleBox-like services in addition to their normal services. Interesting service opportunities could be constructed from the load balancing information afforded to CDNs \cite{draftAkamai} for dynamic multicast construction or re-direction of tokenized packet reconstruction. Conversely, we have largely avoided the situation of how to manage open, competitive ScaleBox resources at the local ISP. Although we would hope that resource allocation would not follow the path of work on QoS, it is an open topic that needs further discussion.

It's clever but aren't you competing for a 2\% savings? A common perception is that insignificant opportunity exists for efficiency within the network, i.e. it is easier to throw bandwidth at the problem than to improve network efficiency. Per our paper in \cite{EEOD:HotWeb06}, we note that relatively naive optimizations on common web sites can achieve reductions of 10x and beyond, despite interspersed dynamic content. Recent examinations of the university tap indicate that naive approaches to stealth multicast can achieve peak gains of up to 80\% with the gains often occurring during the worst periods of congestion, i.e. we would perform best when things are at their worst. Similar observations were noted in \cite{Cheshire} on strictly multimedia traffic. Finally, ScaleBox also relieves pressure on the core of the network in a routing sense through data packet reduction. Rather than $N$ packets necessitating routing decisions, a consolidated tokenized packet offers a $\frac{N}{PF_{Max}}$ reduction and stealth multicast offers an $N$-wise reduction.

Isn't cache management hard? Yes, cache replacement and distributed management are hard problems. Previous works on packet caching including our own \cite{wetherall:dupRsig:SIGCOMM,VBWC,EEOD:HotWeb06} largely ignored how to manage a distributed cache and focused on single point to point caches. We acknowledge this problem as an area for significant research exploration.

Will encryption nullify the gains when IPv6 and IPsec become prolific? While strong requirements for encryption impose restrictions on the benefits of ScaleBox, we are skeptical for several reasons. First, popular content would be much more likely to add integrity checks rather than secrecy for the purposes of speed. Integrity checks are simply additional data bits that could potentially be cached or coalesced. Second, if the requirements for confidentiality can be slightly slackened (knowing that multiple nodes received the same data but not knowing the content is tolerable), our previous work on SAABCOT \cite{SAABCOT} enables the core techniques of ScaleBox with only minimal performance penalties.

Related Work

While there has been a plethora of work on content distribution, we highlight several notable works in the limited space that are highly relevant to ScaleBox. We group the related work into two categories, namely content distribution/replication and redundant packet content elimination.

In the first area, content distribution/replication, works encompass how to replicate content quickly/reliably, wide scale distribution of streaming content, and traditional content distribution. FastReplica \cite{FastReplica} described how to quickly disseminate content to a limited subset of nodes for future content serving. SplitStream \cite{splitStream} and its derivatives in the peer-to-peer streaming domain describe massive streaming networks whereby downstream nodes act in the interest of the collective good of the network. Traditional content distribution networks include industry-based schemes (Akamai, Savvis, etc.), academic efforts (CoDeeN \cite{CoBlitz}, etc.), and open-source efforts (BitTorrent). In contrast to solutions focusing on how to locate content \cite{dataOrientedArch07}, ScaleBox removes the need for object-wise lookup but could still benefit from overlay services directing requests to the centralized content.

Redundant content elimination traces its roots to traditional object caching \cite{CacheRef2}, peer-wise object caching \cite{Squirrel02}, and more recent efforts focusing on partial redundancy. Notably, Spring and Wetherall posited the use of Rabin fingerprinting in \cite{wetherall:dupRsig:SIGCOMM} for protocol independent redundancy extraction while other works such as VBWC (Value-Based Web Caching) \cite{VBWC} and the low-bandwidth filesystem \cite{muthitacharoen01lowbandwidth} also employed Rabin fingerprinting partial redundancy extraction at specific protocols. Cui, Kannan, and Wang took an alternative approach with Discoverer to use Needleman-Wunsch for extracting patterns to reverse engineer network protocols \cite{discover07}. Our own recent work introduced the concept of an Explicit End of Data (EEOD) for TCP to improve packet caching performance \cite{EEOD:HotWeb06} and the proposal of stealth multicast via Passive Application Layer Multicast (PALM) \cite{CompNet07:PALM}. The concept of pre-fetching has emerged in limited places with regards to multimedia \cite{prefetchKhan} but primarily addressed towards buffering improvements. Tail synchronization draws similarity from the field of parallel video and video on demand servers \cite{vodSurvey,ProfitVoD07}. Finally, we note that ScaleBox does not preclude the use of intelligent coding-based schemes \cite{rba-cyclone,digitalFountainBulk} but rather offers an interesting mechanism for introducing synchronization.

Conclusions and Future Work

In conclusion, ScaleBox offers something of an anomaly in the network world: scalability and performance despite centralized serving, akin to the colloquial phrase of having one's cake and eating it too. While we do not claim that ScaleBox is a panacea for the Internet in general, it has numerous compelling points worthy of discussion by the community at large. Although the approach firmly depends on our previous work in this area, the addition of pre-fetching and tail synchronization moves the idea from the realm of cute and clever to what we sincerely believe is a practical case for deployment. Moreover, deployment requires neither clean slate re-design nor massive infrastructure upgrades, with economic incentives extremely well balanced for the current Internet.

Furthermore, the paper introduces interesting philosophical and research questions with regards to design: consideration of what TCP fairness means with bandwidth conservation, open versus closed resource competition, security implications, and cache management. Our on-going work includes constructing a full implementation of ScaleBox\footnote{See NetScale.cse.nd.edu for on-going development} with notable development/research issues including kernel support for server-side pre-fetching, cache management strategies, and a full emulation testbed.

References

Attachments

  • Coalesce-Crop.png (14.7 K, 02 Oct 2007 - 20:46, AaronStriegel): Graph of predicted coalescing time available
  • Prefetch-2.png (22.7 K, 02 Oct 2007 - 20:38, AaronStriegel): TCP prefetching performance
  • ScaleBox.png (54.0 K, 02 Oct 2007 - 20:14, AaronStriegel): ScaleBox Overview
  • TCP-Prefetch.png (40.8 K, 02 Oct 2007 - 20:37, AaronStriegel): TCP prefetching for ScaleBox
  • TailSync.png (38.3 K, 02 Oct 2007 - 20:42, AaronStriegel): Tail synchronization example
r1 - 02 Oct 2007 - 21:35:33 - AaronStriegel