
Author: lievenlb

an even better LaTeX system

A previous post, the best LaTeX system, was a commercial for Gerben Wierda's i-Installer to get a working tetex distribution. I've been working happily with this TeX-system for two years now, but recently ran into a few (minor) problems. In the process of solving these problems I created myself a second tetex-system, more or less by accident. This is what happened. On the computer at the university I once got fun packages running, such as a chess-, go- and Feynman-diagrams package, but somehow I could not reproduce this on my home-machine : I got lots of errors about missing fonts etc. As I really wanted to TeX some chess-diagrams I went surfing for the most recent version of the chess-package and found one for Linux and one under the Fink-project : the chess-tex package. So, I did a

sudo fink install chess-tex

forgetting that, in good Fink-tradition, you can only install a package by installing at the same time all the packages needed to run it. So I was given a whole list of packages that Fink wanted to install, including a full tetex-system. Did I want to continue? I had to think about that for a moment, but realized that the iTeX-tree lives under /usr/local whereas Fink creates its trees under /sw, so there should not really be a problem : yes, let's see what happens. It took quite a while (well over an hour and a half) but when it was done I had a second full TeX-system. But how could I get it running? Of course I could check it via the command line, but then I remembered that there is an alternative to TeXShop, namely iTeXMac, which advertises that it can run either iTeX or the Fink-distribution of tetex. So I downloaded it and looked in the preferences, which indeed contain a pane

where you can choose between using the standard tetex-distribution or the Fink-distribution, and iTeXMac automatically finds the relevant folders. So I wrote a quick chess diagram, ran it through iTeXMac and indeed it produced the graphics I expected! This little success gave me some confidence in the Fink-distribution, so I fired up FinkCommander and looked under categories : text for other TeX packages I could install. There were plenty : Latex2HTML, Latex2rtf, Feynman, tex4ht and so on. Installing them with the commander is fun : just click on the package you want and click 'Install' under the Source-dropdown menu

and in the lower part of the window you can follow the installation process, whereas the upper part tells you which packages are already installed and what their version-numbers are. I still have to figure out how to add new style files to this Fink-tree, and I have to get used to the iTeXMac-editor, but so far I like the robustness of the system and the easy install procedure, so try it out!
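If you want to TeX a chess diagram yourself, here is a minimal sketch. Note that it uses the skak package rather than the chess-tex package installed above (the two have different commands, and I only vouch for the skak syntax below), so treat it as an illustration of the idea :

% minimal-chess.tex -- a sketch using the skak package (not chess-tex!)
\documentclass{article}
\usepackage{skak}   % chess package from CTAN
\begin{document}
\newgame                            % start from the initial position
\mainline{1.e4 e5 2.Nf3 Nc6 3.Bb5}  % play the opening moves of the Ruy Lopez
\showboard                          % typeset the current position as a diagram
\end{document}

Running this through latex (from whichever tetex-tree your front-end points at) should print the moves followed by a board diagram.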


the google matrix

This morning there was an intriguing post on arXiv/math.RA entitled A Note on the Eigenvalues of the Google Matrix. At first I thought it was a joke but a quick Google revealed that the PageRank algorithm really is at the heart of Google technology, so I simply had to find out more about it. An extremely readable account of it can be found in The PageRank Citation Ranking: Bringing Order to the Web, which is really the start of Google. It is coauthored by the two founders : Larry Page and Sergey Brin. A quote from the introduction :

“To test the utility of PageRank for search, we built a web
search engine called Google (Section
5)”

Here is the intuitive idea of _PageRank_ : a page has high rank if the sum of the ranks of its _backlinks_ (that is, the pages linking to the page in question) is high, and it can be computed by the _Random Surfer Model_ (see sections 2.5 and 2.6 of the paper). More formally (at least from my quick browsing of some papers ; maybe the following account is slightly erroneous and I'll have to spend some more time reading), let N be the number of webpages (estimated between 3 and 4 billion) and consider the N x N matrix A, the so-called GoogleMatrix, where

A = c P + (1-c)(v x vec(1))

where P is the column-stochastic matrix (meaning : all entries are zero or positive and the entries in each column add up to 1) with entries

P(i,j) = 1/N(j) if j->i and 0 otherwise

where i and j are webpages, j->i denotes that page j has a link to page i, and N(j) is the total number of pages linked to from page j (all this information is available once we download page j) ; note that with this convention the entries in each column of P do add up to 1. c is a constant 0 < c < 1 and corresponds to the probability that the random surfer follows an _outlink_ (that is, a link to another page) rather than jumping to a random page (it seems that Google uses c=0.85 as an estimate). Finally, v is a column vector with zero or positive entries adding up to 1 and vec(1) is the constant row vector (1,…,1), so v x vec(1) is the N x N matrix with a copy of v in every column. The idea behind this term is that in the _Random Surfer Model_ used to compute the PageRank, the Googlebot (normally following links randomly in the pages it enters) jumps at every step with probability 1-c to an entirely different webpage, where the chance that it will end up at page i is given by the i-th entry of v (this is to avoid being trapped in a web-loop). So, in Google's model the bot _teleports_ itself randomly every 6th link or so. Now, the PageRank is a column-eigenvector of the GoogleMatrix A with eigenvalue 1, which can be approximated by the Random Surfer Model, and the rate of convergence of this process depends on the _second_ largest eigenvalue of A (the largest being 1). In the paper posted this morning a simple proof is given that this eigenvalue is c (because the matrix P has multiple eigenvalues equal to 1). According to a previous paper on the subject, The Second Eigenvalue of the Google Matrix, this statement has
implications for the convergence rate of the standard PageRank algorithm
as the web scales, for the stability of PageRank to perturbations to the
link structure of the web, for the detection of Google spammers, and for
the design of algorithms to speed up PageRank. But I’ll have to
read more to understand the Google spammers bit…
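To make all of this concrete, here is a quick numerical sketch (Python with numpy ; the toy web, the uniform choice of v and all names are my own, this is certainly not Google's code). It builds the GoogleMatrix of a 4-page web made of two disjoint 2-cycles, so that P has two eigenvalues equal to 1, runs the power iteration behind the Random Surfer Model, and checks that the second largest eigenvalue of A is indeed c :

import numpy as np

c = 0.85   # damping constant : probability of following an outlink
N = 4      # toy web : two disjoint 2-cycles, 0 <-> 1 and 2 <-> 3
# column-stochastic link matrix : P[i,j] = 1/N(j) if page j links to page i
P = np.array([[0., 1., 0., 0.],
              [1., 0., 0., 0.],
              [0., 0., 0., 1.],
              [0., 0., 1., 0.]])
v = np.full(N, 1.0 / N)                          # teleportation vector
A = c * P + (1 - c) * np.outer(v, np.ones(N))    # the GoogleMatrix
# Random Surfer Model = power iteration : repeatedly apply A
r = np.full(N, 1.0 / N)
for _ in range(100):
    r = A @ r
print("PageRank:", r)    # a column-eigenvector of A with eigenvalue 1
# since P has two eigenvalues equal to 1, the second largest
# eigenvalue of A should come out as c = 0.85
eigs = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
print("two largest eigenvalues (in absolute value):", eigs[:2])

On this toy web the PageRank comes out uniform by symmetry, but the two largest eigenvalues in absolute value are 1 and 0.85 = c, exactly as the note posted this morning predicts.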


the cpu 2 generation

Never
ever tell an ICT-aware person that you want to try to set up a
home-network before you understand all 65536 port-numbers and their corresponding
protocols. Here is what happened to me this week. Jan Adriaenssens returned from an extended vacation in New Zealand and I told him
about my problems with trying to set up WebDAV securely. He
stared at me with that look that teenage children have when they find out their parents don't know how to handle the simplest things on a mobile, such as saving a number or writing an SMS, let alone using the dictionary… and asked 'now why would you want to do that??? I just use AppleTalk to connect to my computer securely'. Now I'm not such a fool that I didn't try this out, but I didn't manage to get matrix mounted on my Desktop. 'Oh, but that's probably because of the firewall' Jan said, 'just send an email to Peter (the guy running the defenses here) and ask him to open up ports 548 and 427…' And sure enough, five minutes later the problem was solved and I could throw my WebDAV-plans in the dustbin (although I think I've found a use for WebDAV, but will keep this a bit longer to myself until I've checked it out). If you think that was the end of it,
think twice. Never ever point an ICT-professional to your computer. They then start looking at its firewall-logs and find all sorts of things, such as : 'I noticed that traffic from port 53 to the firewall was dropped ; could it be that you configured the firewall as DNS-server? If this is the case, you better remove it and it will increase your network-speed, I think.' And sure enough, that IP-address was set on my machine as one of two possibilities for the DNS-server, so I quickly removed it, and in the process thought that maybe I should also remove the other one, so I sent Peter another email asking whether that was OK. It turned out that the second IP address was the genuine DNS-server, so I got a curt answer back : 'You better leave this as it is otherwise not much will work…' Oh, shame, shame, eternal shame on me!

My only defense is that I still belong
to what I would call the cpu 2 generation (I’m a few years too
old to belong to the more computer-aware generation X). When I
started out doing research in 1980 the single most important command was

cpu 2

which you had to type before you could run any program.
By typing this you asked to be given 2 minutes of central processing time, so you had to write all your programs in such a way that either they gave a result back within 2 minutes, or they contained lots of output-commands giving you a chance to determine at which parameters to restart the program for your next cpu 2. I once computed in
this way all factorial maximal orders in quaternion algebras by spending
a couple of days in the computer room. These days any desktop computer
would finish this task in half a minute. Perhaps the younger generations
will appreciate all the hard computer-work we had to do back then if
they read a bit from the computer history museum page!
