# Upgrade Debian Wheezy to Debian Jessie

Last week I updated RStudio on Debian Wheezy, and it turned out to need a more recent version of the libc6 package. A reliable solution was to upgrade my system to Jessie, the current stable distribution of Debian. Its latest point release, Debian 8.1, was released on June 6, 2015.

In this post I share the steps I followed to upgrade my system while keeping my user configuration and the main programs I use, such as R, RStudio, Matlab, Mendeley, TeXstudio and others.

0) Back up your data: This is the logical first step before making any change to the system.

1.1) Prepare Debian Wheezy to be upgraded: Make sure that your current system has no dependency problems or incorrectly installed packages. You can use the following commands for that purpose:

# Log in as superuser (you will be asked for the root password)
su -
aptitude update
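The commands above only refresh the package index. A minimal, read-only sketch of how one might also look for problem packages before upgrading is shown below. The helper name `check_packages` is hypothetical, and the choice of `dpkg --audit` plus a listing of held packages is an assumption about what is worth checking, not part of the steps I originally followed.

```shell
#!/bin/sh
# Report packages that could get in the way of an upgrade.
# Read-only: nothing on the system is changed.
check_packages() {
    if ! command -v dpkg >/dev/null 2>&1; then
        echo "dpkg not found: this does not look like a Debian-based system"
        return 0
    fi
    echo "== Half-installed or unconfigured packages (empty means none) =="
    dpkg --audit
    echo "== Packages on hold (empty means none) =="
    dpkg --get-selections | grep 'hold$' || true
}

report="$(check_packages)"
echo "$report"
```

If either section lists packages, it is probably worth fixing them before touching the repository list.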


1.2) Update the repositories list: Packages for Debian Jessie are downloaded from these repositories. One way to update the list is to edit the file /etc/apt/sources.list. I use gedit for that:

gedit /etc/apt/sources.list


In my case, I put the following repositories:

# Basic repositories

deb http://ftp.uk.debian.org/debian/ jessie main
deb-src http://ftp.uk.debian.org/debian/ jessie main

# Repositories for jessie-updates, previously known as 'volatile'

# For wifi drivers
deb http://http.debian.net/debian/ jessie main contrib non-free

# For R backports: The mirror can be modified
deb http://cran.ma.imperial.ac.uk/bin/linux/debian jessie-cran3/

# For Flash Player
deb http://ftp.us.debian.org/debian jessie contrib


Another option is to replace the word wheezy with jessie automatically using sed:

sed -i 's/wheezy/jessie/g' /etc/apt/sources.list
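Since `sed -i` overwrites the file in place, it may be worth trying the substitution on a copy first. The following sketch runs the same expression on a temporary sample file and keeps a backup with the `-i.bak` form. The two sample lines are illustrative only and are not meant to reproduce a real sources.list.

```shell
#!/bin/sh
# Try the substitution on a temporary sample file instead of the real
# /etc/apt/sources.list. The sample lines below are illustrative only.
tmp="$(mktemp)"
cat > "$tmp" <<'EOF'
deb http://ftp.uk.debian.org/debian/ wheezy main
deb-src http://ftp.uk.debian.org/debian/ wheezy main
EOF

# -i.bak edits the file in place and keeps the original as "$tmp.bak"
sed -i.bak 's/wheezy/jessie/g' "$tmp"

# Count matching lines in the edited file and in the backup
remaining="$(grep -c wheezy "$tmp" || true)"
converted="$(grep -c jessie "$tmp")"
backup_lines="$(grep -c wheezy "$tmp.bak")"

echo "lines still mentioning wheezy: $remaining"
echo "lines now mentioning jessie:   $converted"
echo "wheezy lines kept in backup:   $backup_lines"

# Remove the temporary files
rm -f "$tmp" "$tmp.bak"
```

Run as root against /etc/apt/sources.list, the same `-i.bak` form would leave the Wheezy version in /etc/apt/sources.list.bak.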


1.3) Update the package index from the Debian 8.1 Jessie repositories.

apt-get update
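The upgrade command itself does not appear in the steps above. A common sequence, stated here as an assumption based on the usual Debian procedure rather than on my own notes, is a minimal `apt-get upgrade` followed by `apt-get dist-upgrade`. The sketch below only prints the commands unless `DRY_RUN=0` is set. The names `run` and `upgrade_to_jessie` are hypothetical helpers.

```shell
#!/bin/sh
# Sketch of a typical upgrade sequence. With DRY_RUN=1 (the default) the
# commands are only printed; set DRY_RUN=0 and run as root to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

upgrade_to_jessie() {
    run apt-get update         # refresh the index from the jessie repositories
    run apt-get upgrade        # minimal upgrade: no packages removed or newly installed
    run apt-get dist-upgrade   # full upgrade: resolves changed dependencies
}

plan="$(upgrade_to_jessie)"
echo "$plan"
```

Splitting the work into a minimal upgrade and then a full one tends to keep each step smaller and easier to follow if something goes wrong.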


During the upgrade you will be asked whether some currently running services should be restarted manually or automatically. Restarting them manually is the suggested choice.

Furthermore, after upgrading the distribution, I had to choose the device where GRUB should be installed. If this is your case, select the /dev/sda device if your PC has only one disk (use the spacebar to choose the device). Otherwise, see http://askubuntu.com/questions/23418/what-do-i-select-for-grub-install-devices-after-an-update.

1.5) Finally, reboot your computer to start the Debian Jessie system, and enjoy.

reboot


2) Useful links to install programs:

2.1) Software R: http://cran.r-project.org/bin/linux/debian/

2.3) TeXstudio (interface for the LaTeX scientific text editor): http://packages.debian.org/jessie/texstudio

2.4) Mendeley Desktop (reference manager): http://www.mendeley.com/

2.5) Dropbox: https://www.dropbox.com/

# Thesis Template in Latex (UNI)

Some months ago I finished my undergraduate thesis. For it, I modified the ClemsonThesis project by Andrew R. Dalton to create the UniThesis.cls LaTeX class, a template for undergraduate theses at Universidad Nacional de Ingeniería (UNI). The template has the features required by UNI, but students at other universities can use it by changing the personal information. The project is hosted on https://github.com/ and you can download and use it through the following link:

I really recommend this template if you are a UNI student and are curious to learn LaTeX. Please do not hesitate to ask me any questions.

Example of a UNI thesis cover page from the UniThesis.cls project

For people who like to take courses on the MOOC platform Coursera, I strongly recommend coursera-dl for downloading lecture resources in bulk (.ppt, .pdf, .mp4). You can download all the available resources or filter by section name, lecture name, format and other criteria. The installation can be a bit of hard work for people who are not accustomed to the terminal or console, but it is really worth it. I hope it turns out to be as useful for you as it was for me.

# Print eps figure with accent in matlab

Matlab is powerful software for plotting images in different styles and formats, so researchers use it to make figures for their papers. The eps format is one of the best for papers and presentations. We usually add text to the image, such as axis labels, titles or annotations at specific positions, either on plots with arbitrary axes or on maps with latitude and longitude axes.

However, exporting to .eps causes problems when any text in the image contains accented characters. Here I present a way to export eps images by using the LaTeX interpreter option directly in Matlab.

% clear everything before starting
clc, clear all, close all;
% load the coastline data (defines lat and long) and set the map parameters
load coast
subplot(1,2,1)
axesm('MapProjection','pcarree',...
'FLineWidth',2.5,...
'Frame','on',...
'MLineLocation',5,...
'PLineLocation',5,...
'Grid','on',...
'MapLatLimit',[-21 -1],...
'MapLonLimit',[-88 -69],...
'MeridianLabel','on', ...
'ParallelLabel','on',...
'GAltitude',5,...
'MLabelParallel','south');plotm(lat,long)
% plot the world coastlines in regions
patchesm(lat,long,[.7 .8 .7]);
tightmap;
setm(gca,'ffacecolor',[114 172 230]/255)
% add some text with the default interpreter (accented characters typed directly)
textm(-9.5,-76.5,'PERÚ','FontSize',16,'fontWeight','bold')
textm(-12.2,-81.5,'OCÉANO','FontSize',10,'fontWeight','bold')
textm(-13.2,-81.5,'PACÍFICO','FontSize',10,'fontWeight','bold')
title('LÍNEA COSTERA DE PERÚ','FontSize',14)
subplot(1,2,2)
axesm('MapProjection','pcarree',...
'FLineWidth',2.5,...
'Frame','on',...
'MLineLocation',5,...
'PLineLocation',5,...
'Grid','on',...
'MapLatLimit',[-21 -1],...
'MapLonLimit',[-88 -69],...
'MeridianLabel','on', ...
'ParallelLabel','on',...
'GAltitude',5,...
'MLabelParallel','south');plotm(lat,long)
% plot the world coastlines in regions
patchesm(lat,long,[.7 .8 .7]);
tightmap;
setm(gca,'ffacecolor',[114 172 230]/255)
% add some text in latex format
textm(-9.5,-76.5,'PER\''{U}','FontSize',16,'fontWeight','bold','interpreter','LaTex')
textm(-12.2,-81.5,'OC\''{E}ANO','FontSize',10,'fontWeight','bold','interpreter','LaTex')
textm(-13.2,-81.5,'PAC\''{I}FICO','FontSize',10,'fontWeight','bold','interpreter','LaTex')
title('L\''INEA COSTERA DE PER\''U','FontSize',14,'interpreter','LaTex')

% export the figure in eps format
print -depsc prueba


The left image was made without the LaTeX interpreter and the right image with it. As you can see, the key is to add the option ('interpreter','latex') to text functions such as title(), xlabel(), ylabel(), text(), textm() and others. The image above does not show the real resolution printed by Matlab, because I had to convert it to .png format in order to upload it to this post.

# Summary of Cluster Analysis Distances

Cluster analysis is one of the most useful techniques in research and applied studies across a wide range of fields. It is also considered a data reduction technique, like principal component analysis (PCA), except that instead of analyzing the variables we analyze the profiles or records. The starting point of cluster analysis is the proximity matrix, which measures the similarity between objects; this is the most important concept for building clusters.

I am not going to present the full theory of this technique. Instead, I want to describe the kinds of distances that can be used to measure the similarity or dissimilarity between objects. Several software packages can perform cluster analysis, including Matlab and R (I mention these two because of their large user base), and the question that arises is which distance to use for the analysis, mainly for people who are not very familiar with the deeper theory of statistics or mathematics. So let's go.

The proximity between objects can be analyzed through similarity or dissimilarity measures. A common example of a similarity measure is the Pearson correlation coefficient, while the Euclidean distance is a common dissimilarity measure.

Dissimilarity

Considering two objects $r$ and $s$, $p_{rs}$ is a dissimilarity measure if its values are greater than or equal to 0, $p_{rs}=0$ when the two objects are identical, and $p_{rs}=p_{sr}$.

• Euclidean Distance:
• This is the most common distance in cluster analysis because it measures the geometric distance between two points in an n-dimensional space, so it tells us whether two points are near or far apart in that space. It gives the same result whether the variables are centered or not. It works well when the analyzed variables were measured on the same scale or when their scales do not differ much. The squared distance is expressed as follows:

$\displaystyle d_{rs}^2=\sum_{j=1}^p (x_{rj}-x_{sj})^2$

• Standardized Euclidean Distance:
• When the variables are measured on different scales, the Euclidean distance is not a good dissimilarity index because it can be dominated by the variable with the largest scale. In this situation, the standardized Euclidean distance is a good alternative. As you can see, it is similar to the Euclidean distance but weights each variable by the inverse of its variance $s_j^2$; here $D$ is the diagonal matrix of those variances.

$\displaystyle d_{rs}^2=\sum_{j=1}^p \frac{(x_{rj}-x_{sj})^2}{s_j^2}=(x_r-x_s)'D^{-1}(x_r-x_s)$

• Mahalanobis Distance:
• It takes into account the differences in variance between variables as well as their covariance structure, through the sample covariance matrix $S$. This distance is equivalent to applying the Euclidean distance to the full set of standardized principal components.

$\displaystyle d_{rs}^2=(x_r-x_s)'S^{-1}(x_r-x_s)$

Note that this distance removes the covariance structure from the data. That makes it inadequate in situations where the correlation between variables is itself important for the distance.

• Manhattan or City Block Metric:
• This distance is the sum of the absolute values of the differences between the coordinates. In this metric, a constant difference of amount $a$ in each of the $p$ coordinates has the same effect on the total distance as a difference of amount $pa$ in only one coordinate. That is not true for the Euclidean distance because, for example, $5^2 + 5^2 \neq (5 + 5)^2$. Furthermore, it is much less sensitive to the presence of outliers.

$\displaystyle b_{rs}=\sum_{j=1}^p |x_{rj}-x_{sj}|$

• Minkowski Metric:
• The Minkowski metric is a more general distance that covers some of the distances presented above. When $\lambda=2$ it is the Euclidean distance, and when $\lambda=1$ it is the Manhattan distance. For $\lambda \geq 1$ it satisfies the triangle inequality $m_{rs} \leq m_{rt}+m_{ts}$ for any third object $t$.

$\displaystyle m_{rs}=\left[\sum_{j=1}^p |x_{rj}-x_{sj}|^\lambda\right]^{1/\lambda}$
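As a small worked example of the dissimilarity measures above, take two hypothetical objects with $p=2$ variables, $x_r=(1,2)$ and $x_s=(4,6)$, and assume the variances are $s_1^2=9$ and $s_2^2=4$. All of these values are invented for illustration.

```latex
\begin{align*}
\text{Euclidean:} \quad
  & d_{rs} = \sqrt{(1-4)^2 + (2-6)^2} = \sqrt{9 + 16} = 5 \\
\text{Standardized Euclidean:} \quad
  & d_{rs} = \sqrt{\tfrac{9}{9} + \tfrac{16}{4}} = \sqrt{5} \approx 2.24 \\
\text{Manhattan:} \quad
  & b_{rs} = |1-4| + |2-6| = 3 + 4 = 7 \\
\text{Minkowski } (\lambda = 3): \quad
  & m_{rs} = \left(3^3 + 4^3\right)^{1/3} = 91^{1/3} \approx 4.50
\end{align*}
```

For the same pair of points the Minkowski value decreases as $\lambda$ grows: 7 for $\lambda=1$, 5 for $\lambda=2$ and about 4.50 for $\lambda=3$.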

Similarity

Considering two objects $r$ and $s$, $p_{rs}$ is a similarity measure if its values lie in the range $[0, 1]$, $p_{rs}=1$ when the two objects are identical, and $p_{rs}=p_{sr}$.

• Cosine:
• In multivariate analysis, the cosine of the angle between two vectors is used as a measure of similarity. It considers only the direction of the two vectors and does not depend on their lengths. This kind of measure is useful when you want to compare the structure of the profiles. Note that the cosine can take values in $[-1, 1]$, so it stays within $[0, 1]$ only when the data are nonnegative.

$\displaystyle c_{rs}=\sum_{j=1}^p x_{rj}x_{sj}/\sqrt{\sum_{j=1}^p x_{rj}^2 \sum_{j=1}^p x_{sj}^2}$

• Correlation coefficient:
• When the cosine is computed on profiles centered at their own means, it is known as the Pearson correlation coefficient.

$\displaystyle q_{rs}=\sum_{j=1}^p (x_{rj}-\overline{x}_{r.})(x_{sj}-\overline{x}_{s.})/\sqrt{\sum_{j=1}^p (x_{rj}-\overline{x}_{r.})^2 \sum_{j=1}^p (x_{sj}-\overline{x}_{s.})^2}$
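To see how much the centering matters, here is a small worked example with two hypothetical profiles of $p=3$ variables, $x_r=(1,2,3)$ and $x_s=(3,2,1)$. The values are invented for illustration.

```latex
\begin{align*}
c_{rs} &= \frac{1\cdot 3 + 2\cdot 2 + 3\cdot 1}{\sqrt{(1+4+9)(9+4+1)}}
        = \frac{10}{14} \approx 0.71 \\
\overline{x}_{r.} &= \overline{x}_{s.} = 2 \\
q_{rs} &= \frac{(-1)(1) + (0)(0) + (1)(-1)}{\sqrt{(1+0+1)(1+0+1)}}
        = \frac{-2}{2} = -1
\end{align*}
```

The cosine reports the two profiles as fairly similar because all values are positive, while the correlation reports perfectly opposite shapes. The negative value also shows that the correlation coefficient can fall outside the $[0, 1]$ range.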

For more information, see the book by J. D. Jobson, “Applied Multivariate Data Analysis: Volume II”. I hope this information will be useful for you.