Open Source Documents: Review - ENAVIS
Acceptance Letter
On behalf of the LISA 2008 Program Committee,
Congratulations!
Your submission:
024. ENAVis: Enterprise Network Activities Visualization
was chosen to be part of the LISA 2008 Refereed papers track. Your very first
task is to reply to me within the next 7 days and confirm your acceptance.
You are assigned the following Shepherd(s) for your paper:
XXX (Removed for public posting)
Your second task is to contact your Shepherd, introduce yourself, and get any
questions out of the way.
It is important to provide the Shepherd with an interim draft as well as a final
manuscript - before the deadlines, and with suitable time for them to read and
respond. They will help you deliver a high quality final paper you can be proud
of. A good target date for an interim draft to your Shepherd is July 23rd (i.e.,
4 weeks before the August 20 final deadline). Please work with your Shepherd if
this is a difficult interim draft deadline.
Furthermore, formal detailed instructions from USENIX staff for accepted paper
authors live at:
http://www.usenix.org/events/lisa08/instrux/
Some of these instructions require filling out consent to publish forms. Please
attend to these promptly. Note the bold emphasis on the August 20, 2008 hard
deadline. I STRONGLY suggest you use August 13 instead as your internal
deadline for a final written paper, turned in to typesetting.
Enclosed are comments generated by reviewers of your paper. You may not agree
with them all, and you may see a broad range of opinions. Some may differ with
one another. Please consider these as data points and a springboard for
improving your paper.
Last but not least - thank you for being courageous enough to submit your paper
to your peers. You are now part of the LISA 2008 program, one of the select few
to have been chosen. Again, Congratulations.
Regards,
Mario Obejas
LISA 2008 Program Chair
Review 1
There is a certain novelty in their approach in that they have taken
data that has been available to us and has been well understood and
extracted this for use in a powerful extrapolation and visualization.
I would like to see a discussion around sampling rate and the accuracy
of the collected data. Take the example with the missing connection to
the ntp server, where it's quite possible that the probe schedule
consistently misses the schedule ntpd uses to query the time service.
Have the authors considered how representative the data is, and
approaches to get better sampling?
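To make the sampling concern concrete, the following minimal Python sketch
counts how many short-lived, periodic connections overlap at least one probe
instant. Only the 5 second probe period comes from the reviews; the function
name, the 100 ms connection lifetime, and the 60 s and 64 s query intervals
are illustrative assumptions, not values from the paper.

```python
def observed_events(poll_period_ms, event_period_ms, event_duration_ms,
                    event_offset_ms, horizon_ms):
    """Return (seen, total) for periodic short-lived events under periodic polling.

    Polls happen at t = 0, P, 2P, ... An event is "seen" only if some poll
    instant falls inside [start, start + duration). All times are integers
    in milliseconds to avoid floating-point drift.
    """
    seen = 0
    total = 0
    start = event_offset_ms
    while start < horizon_ms:
        total += 1
        # First poll instant at or after the event start (integer ceiling).
        next_poll = -(-start // poll_period_ms) * poll_period_ms
        if next_poll < start + event_duration_ms:
            seen += 1
        start += event_period_ms
    return seen, total


# Hypothetical numbers: 5 s probe, 100 ms connection, one hour of observation.
# A 60 s query interval is a multiple of the probe period, so the phase never
# shifts and every connection is missed.
print(observed_events(5000, 60000, 100, 1000, 3_600_000))  # (0, 60)
# A 64 s interval drifts against the probe, so some connections are caught.
print(observed_events(5000, 64000, 100, 1000, 3_600_000))  # (12, 57)
```

Under these assumptions a phase-locked schedule is never observed, while a
drifting one is seen roughly one time in five; randomizing the probe phase is
one possible way to avoid the first case.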
The statement that it's a lightweight process could be contested. If you
have something that consumes 5% of your system resources every 5
seconds, that poses a significant consumption on busy production
systems. I also would like to hear if they've encountered operational
issues, such as lsof hanging or the probes actually interrupting
services.
While it's largely a data persistence implementation detail, it'd be
interesting to see a discussion from the authors about how ENAVis could
scale, say when you go to 5-10K server environments, both from the
perspective of the infrastructure as well as the front end tools they
have developed. 500 server sampling is definitely at the lower end of
'Enterprise'.
Overall I like this paper and the project. The authors seem to have
worked from a pretty well defined idea and the paper serves as a good
presentation of their work.
Review 2
I like the approach described in this paper quite well. I have some
questions about the data collection process. Have you determined how
effective polling is? Is a 5 second frequency determined through some
process? Would it be possible to switch to an event-based model instead,
perhaps using kernel level ip firewalling? How disruptive is the data
collection on users? I realize there is a relatively small amount of data
collected, however, a 5 second frequency is quite high, particularly if you
are doing any sort of parallel computation.
Does this system collect netflow data, or just depend on client-side
metrics? If not, the system would be susceptible to rootkits that replace
ps, netstat, or lsof.
Is this code released publicly? I think that it would be of great general
interest.
Review 3
The paper would be improved by a short explanation of bipartite
graph. (Best not to assume reader is versed in graph theory.)
Perhaps mentioning that the graphs have only pairwise connections
between nodes is sufficient.
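For reference, the usual definition is that a graph is bipartite when its
nodes can be split into two groups such that every edge joins a node in one
group to a node in the other. The minimal Python sketch below tests this by
two-coloring. The host and user names are hypothetical and only illustrate
the kind of graph the paper might draw; nothing here is taken from the
authors' implementation.

```python
from collections import defaultdict, deque


def two_color(edges):
    """Return a dict node -> 0 or 1 if the graph is bipartite, else None."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    color = {}
    for root in adj:
        if root in color:
            continue
        color[root] = 0
        queue = deque([root])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in color:
                    # Neighbors always take the opposite group.
                    color[nbr] = 1 - color[node]
                    queue.append(nbr)
                elif color[nbr] == color[node]:
                    # An edge inside one group: not bipartite.
                    return None
    return color


# Hypothetical host-user edges: every edge joins a host to a user.
logins = [("host1", "alice"), ("host1", "bob"), ("host2", "alice")]
groups = two_color(logins)
print(groups is not None)  # True

# A triangle cannot be split into two groups.
print(two_color([("a", "b"), ("b", "c"), ("c", "a")]))  # None
```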
The paper needs proof-reading and subsequent edits for typos and
awkwardly worded sentences. Including, but not limited to:
"pretending from"
neglectable => negligible
Therefor => Therefore
heterecious => heterogeneous?
Question marks at the ends of sentences not worded as questions
The paper would be improved by alternate wordings for some trite or
colloquial phrases including:
"a picture is better than a thousand words"
"a few mouse clicks", less than `n' mouse clicks
"it collects a bunch of system-wide information"
"exciting"
"woke up" => awakened?
"HR" => Human Resources?
"the admin fires up the tool"
A reader might wonder why netstat is of use (Tier Two), when lsof
(Tier Three) is available. Doesn't lsof -i provide all the pertinent
info?
Unless a specific feature of "MySQL 5.0 Community Server" is required
(if so, mention what features), there is no need to mention such
detail as it is inconsistent with the rest of the components.
A URL is preferable to a hostname in Footnote 1.
Figure 4 is unreadable on paper. In such cases, a descriptive
caption mentioning the take-away idea is warranted.
Figure 5 would be improved by mentioning what "H", "U", and "A"
mean in the diagram or the caption.
Reference [11] (lsof) is insufficient.
Vic Abell authored lsof independently of any operating system.
Probably the author name and ftp url would suffice:
ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/
Some other useful pointers are here: http://people.freebsd.org/~abe/
The word "Java" does not need to be in typewriter font; the paper
seems to be referring to the language, not the interpreter/command.
In Section VI B, the term "combo box" is used. This is an API
detail the reader is likely not familiar with.
Review 4
This is a well written paper (barring some typos/wordings) and the solution
is geared toward the system administrator's perspective, which would fit very
well with the LISA venue. Also, this tool has been tested for managing a real
cluster of machines (that are part of a Condor cluster). Figure 15 is an
interesting visual analytic example. The only concern is that it would
not be easy to spot unusual formations if/when there are thousands of
benign clusters visualized as well.
We have used the Prefuse library in the past and have found that, while it is
easy to program in Java, it has certain limitations in terms of
scalability. If you throw in thousands of nodes it only shows the first set of
clusters, which the developer should know about and work around by
feeding smaller sets into the Prefuse tool. It looks like the examples show
that the expected use of the ENAVis tool does not require more than a few
hundred nodes. Also, Figure 15 is still possible given that "most" users log
into one server at a time remotely. Thus, I think clusters of size 2
and below can be filtered out, so that the Prefuse tool can pick up the clusters
that matter to sysadmins. (Of course, this threshold can be configurable
as well.)
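The filtering suggested above can be sketched as a pre-processing step that
runs before any graph is handed to the visualization toolkit. This is a
minimal Python illustration, not the authors' code: the function name, the
edge-list input, and the default threshold of 3 are assumptions chosen to
match "size 2 and below can be filtered out".

```python
from collections import defaultdict


def large_clusters(edges, min_size=3):
    """Return connected components (as sets of nodes) with at least min_size nodes."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    seen = set()
    clusters = []
    for root in adj:
        if root in seen:
            continue
        # Depth-first walk to collect one connected component.
        component = set()
        stack = [root]
        seen.add(root)
        while stack:
            node = stack.pop()
            component.add(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        # Keep only components at or above the configurable threshold.
        if len(component) >= min_size:
            clusters.append(component)
    return clusters


# Hypothetical user-to-host logins: two ordinary pairs and one larger cluster.
logins = [("u1", "h1"), ("u2", "h2"), ("u3", "h3"), ("u4", "h3"), ("u5", "h3")]
print(len(large_clusters(logins)))  # 1
```

Only the four-node cluster around h3 survives; the two single-login pairs are
dropped before rendering.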
It would be nice if the authors could share some of the detailed development
experiences as well. The current paper contains many good "usage" examples.
Overall, the tool will be quite valuable when it becomes available at
production-level quality.
Review 5
Wonderful use of the Prefuse library, especially with your additions for
graph width and node-focusing! These are essential for the tool to be of
real use.
As others have mentioned, there is quite a bit of work needed on the
exposition in English. I sense that there was a mix of non-native and
native speakers, e.g. the related work section. This should not be a
problem as long as the presenter practices long and hard!
The beginning part of the paper is a little redundant.
Review 6
The authors describe a context-sensitive visualization environment in which
one can explore the performance and behavior of hosts and users. The
environment is robust and simple and its use is supported by several
compelling case studies.
Some sentences are nonsense, e.g., "Sometimes the physically host concept
is no longer important." and "The management needs to know whether their
employees have complied with the company's network usage policy with
regards to finance information compliance."
Spelling: heterecious => heteroecious