Hex, Bugs and More Physics | Emre S. Tasci

Octave Functions for Information Gain (Mutual Information)

April 7, 2008 Posted by Emre S. Tasci

Here is a set of functions, freshly coded in GNU Octave, that deal with Information Gain and related quantities.

For the theoretical background and the Mathematica counterparts of these functions, refer to: A Crash Course on Information Gain & some other DataMining terms and How To Get There (my post dated 21/11/2007).

function [r t p] = est_members(x)

Finds the distinct members of the given vector x and returns 3 vectors r, t and p, where:

r : ordered unique member list
t : number of occurrences of each element in r
p : probability of each element in r

Example :

octave:234> x
x =
1 2 3 3 2 1 1

octave:235> [r t p] = est_members(x)
r =
   1   2   3
t =
   3   2   2
p =
   0.42857   0.28571   0.28571
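For readers without Octave at hand, the same bookkeeping can be sketched in Python. This is only an illustration: the function name is borrowed from the Octave version, and the use of collections.Counter is my own choice, not part of the original code.

```python
from collections import Counter

def est_members(x):
    """Return (r, t, p): sorted distinct values, their counts, their relative frequencies."""
    counts = Counter(x)
    r = sorted(counts)                 # ordered unique member list
    t = [counts[v] for v in r]         # number of occurrences of each element in r
    n = len(x)
    p = [c / n for c in t]             # empirical probability of each element in r
    return r, t, p

r, t, p = est_members([1, 2, 3, 3, 2, 1, 1])
print(r, t, [round(v, 5) for v in p])
```

With the example vector above, r, t and p should match the Octave output.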

function C = est_scent(z,x,y,y_n)

Calculates the specific conditional entropy H(X|Y=y_n)

It is assumed that:
X is stored in the x-th row of z
Y is stored in the y-th row of z

Example :

octave:236> y
y =
1 2 1 1 2 3 2

octave:237> z=[x;y]
z =
1 2 3 3 2 1 1
1 2 1 1 2 3 2

octave:238> est_scent(z,1,2,2)

ans = 0.63651

function cent = est_cent(z,x,y)

Calculates the conditional entropy H(X|Y)
It is assumed that:
X is stored in the x-th row of z
Y is stored in the y-th row of z

Example :

octave:239> est_cent(z,1,2)
ans = 0.54558
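The relation between the two quantities, H(X|Y) = Σ P(Y=y_n) · H(X|Y=y_n), can be checked with a small Python sketch. The function names below are hypothetical and the natural logarithm is assumed, as in the Octave functions.

```python
from collections import Counter
from math import log

def entropy(values):
    """Entropy (natural log) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * log(c / n) for c in Counter(values).values())

def specific_conditional_entropy(x, y, y_n):
    """H(X | Y = y_n): entropy of the x values at the positions where y equals y_n."""
    return entropy([xi for xi, yi in zip(x, y) if yi == y_n])

def conditional_entropy(x, y):
    """H(X | Y): the specific conditional entropies weighted by P(Y = y_n)."""
    n = len(y)
    return sum((c / n) * specific_conditional_entropy(x, y, y_n)
               for y_n, c in Counter(y).items())

x = [1, 2, 3, 3, 2, 1, 1]
y = [1, 2, 1, 1, 2, 3, 2]
print(round(specific_conditional_entropy(x, y, 2), 5))
print(round(conditional_entropy(x, y), 5))
```

For the x and y of the examples, the two printed values should agree with est_scent(z,1,2,2) and est_cent(z,1,2).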

function ig = est_ig(z,x,y)

Calculates the Information Gain (Mutual Information) IG(X|Y) = H(X) – H(X|Y)
It is assumed that:
X is stored in the x-th row of z
Y is stored in the y-th row of z

Example :

octave:240> est_ig(z,1,2)
ans = 0.53341

function ig = est_ig2(x,y)
Calculates the Information Gain IG(X|Y) = H(X) – H(X|Y)

Example :

octave:186> est_ig2(x,y)
ans = 0.53341

function ig = est_ig2n(x,y)
Calculates the Normalized Information Gain
IG(X|Y) = [H(X) – H(X|Y)]/min(H(X),H(Y))

Example :

octave:187> est_ig2n(x,y)
ans = 0.53116
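As a cross-check, the information gain can also be computed from the standard identity IG = H(X) + H(Y) - H(X,Y), which is equivalent to H(X) - H(X|Y). The Python sketch below takes that route; it is not a translation of the Octave functions, and its function names are hypothetical.

```python
from collections import Counter
from math import log

def entropy(values):
    """Entropy (natural log) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * log(c / n) for c in Counter(values).values())

def information_gain(x, y):
    """IG = H(X) + H(Y) - H(X,Y); the joint entropy is taken over the (x_i, y_i) pairs."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def normalized_information_gain(x, y):
    """Information gain divided by min(H(X), H(Y))."""
    return information_gain(x, y) / min(entropy(x), entropy(y))

x = [1, 2, 3, 3, 2, 1, 1]
y = [1, 2, 1, 1, 2, 3, 2]
print(round(information_gain(x, y), 5))
print(round(normalized_information_gain(x, y), 5))
```

Because this form is symmetric in X and Y, swapping the two arguments should give the same information gain.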

function ent = est_entropy(p)

Calculates the entropy for the given probabilities vector p :
H(P) = – SUM(p_i * Log(p_i))

Example :

octave:241> p
p =
0.42857 0.28571 0.28571

octave:242> est_entropy(p)
ans = 1.0790
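A minimal Python counterpart of the entropy formula is sketched below. Skipping zero probabilities is my own addition, based on the usual convention that p·log(p) tends to 0 as p tends to 0.

```python
from math import log

def est_entropy(p):
    """H(P) = -sum(p_i * log(p_i)) with the natural log; zero probabilities contribute nothing."""
    return -sum(pi * log(pi) for pi in p if pi > 0)

print(f"{est_entropy([3/7, 2/7, 2/7]):.4f}")
```

Without the guard, a zero entry would raise a math domain error in Python, and would give NaN in Octave.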

function ent = est_entropy_from_values(x)

Calculates the entropy for the given values vector x:

H(P) = -Sum(p(a_i) * Log(p(a_i)))

where {a} is the set of possible values for x.

Example :

octave:243> x
x =
1 2 3 3 2 1 1

octave:244> est_entropy_from_values(x)
ans = 1.0790

Supplementary function files:

function [r t p] = est_members(x)
% Finds the distinct members of x (x is expected to be a row vector)
% r : ordered unique member list
% t : number of occurrences of each element in r
% p : probability of each element in r
% Emre S. Tasci 7/4/2008
    r = unique(x);
    t = zeros(size(r));
    l = 0;
    for i = r
        l++;
        t(l) = length(find(x==i));
    endfor
    N = length(x);
    p = t/N;
endfunction

function C = est_scent(z,x,y,y_n)
% Calculates the specific conditional entropy H(X|Y=y_n)
% It is assumed that:
%   X is stored in the x-th row of z
%   Y is stored in the y-th row of z
%   y_n is one of the possible values of Y
%       (i.e.   [r t p] = est_members(z(y,:));
%               y_n = r(n))
%   Emre S. Tasci 7/4/2008
[r t p] = est_members(z(x,:)(z(y,:)==y_n));
C = est_entropy(p);
endfunction

function cent = est_cent(z,x,y)
% Calculates the conditional entropy H(X|Y)
% It is assumed that:
%   X is stored in the x-th row of z
%   Y is stored in the y-th row of z
% Emre S. Tasci 7/4/2008
cent = 0;
j = 0;
[r t p] = est_members(z(y,:));
for i=r
    j++;
    cent += p(j)*est_scent(z,x,y,i);
endfor
endfunction

function ig = est_ig(z,x,y)
% Calculates the Information Gain IG(X|Y) = H(X) - H(X|Y)
% X is stored in the x-th row of z
% Y is stored in the y-th row of z
% Emre S. Tasci 7/4/2008
[r t p] = est_members(z(x,:));
ig = est_entropy(p) - est_cent(z,x,y);
endfunction

function ig = est_ig2(x,y)
% Calculates the Information Gain IG(X|Y) = H(X) - H(X|Y)
% Emre S. Tasci <e.tasci@tudelft.nl> 8/4/2008
z = [x;y];
[r t p] = est_members(z(1,:));
ig = est_entropy(p) - est_cent(z,1,2);
endfunction

function ig = est_ig2n(x,y)
% Calculates the Normalized Information Gain
% IG(X|Y) = [H(X) - H(X|Y)]/min(H(X),H(Y))
% Emre S. Tasci <e.tasci@tudelft.nl> 8/4/2008
z = [x;y];
[r t p] = est_members(z(1,:));
entx = est_entropy(p);
enty = est_entropy_from_values(y);
minent = min(entx,enty);
ig = (entx - est_cent(z,1,2))/minent;
endfunction

function ent = est_entropy(p)
% Calculates the entropy of the given probability vector p:
%       H(P) = -Sigma(p_i*Log(p_i))
%   Zero probabilities are skipped, since p*log(p) -> 0 as p -> 0
%   If you want to calculate the entropy directly from an array of values,
%       use the est_entropy_from_values(x) function
%
% Emre S. Tasci 7/4/2008
ent = 0;
    for i = p
        if (i > 0)
            ent += -i*log(i);
        endif
    endfor
endfunction

function ent = est_entropy_from_values(x)
% Calculates the entropy of the given set of values x:
%       H(X) = -Sigma(p(x_i)*Log(p(x_i)))
%   If you want to calculate the entropy from an array of probabilities,
%       use the est_entropy(p) function
%
% Emre S. Tasci 7/4/2008
ent = 0;
[r t p] = est_members(x);
    for i = p
        ent += -i*log(i);
    endfor
endfunction

Filed in Coding, Octave | No Comments »

Less is More MySQL

January 9, 2008 Posted by Emre S. Tasci

(-Right now, I’m going to define a new category: the MySQL category!-)

Suppose that you have a database which contains a huge number of entries about materials (for instance, the Pauling Database). Let it have 163 different properties we can query. It is optimized for queries, so the values are enumerated and the labels for these are kept in "pauling.dblxxx" tables, which may look something like this for the "Chemical System" property:

The values are kept in "pauling.valxxx" tables:

(The first column is the EntryCode, the PRIMARY key that relates all the tables, and the second is the enumeration of the value. For example, 1422 for the 13th property is actually Ho-Ir 🙂)

and there is one more set of tables, the paulingv2.valxxx tables which are stored and keyed in val order, so:

if we want to find the EntryCodes corresponding to a given property, we query the paulingv2.valxxx tables
if we want to find the property that is corresponding to a given EntryCode we query the pauling.valxxx tables
if we want to "translate" the property’s enumeration, we query the pauling.dblxxx tables.
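The three lookup directions can be pictured with a tiny in-memory model. The Python sketch below is only an illustration: apart from the 1422 → Ho-Ir pair mentioned above, all EntryCodes, enumeration ids and labels are made up, and dictionaries stand in for the MySQL tables.

```python
from collections import defaultdict

# Hypothetical miniature of the tables for property 13 (Chemical System).
dbl013 = {1422: "Ho-Ir", 1423: "Ho-Mn"}        # pauling.dbl013: enumeration id -> label
val013 = {101: 1422, 102: 1423, 103: 1422}     # pauling.val013: EntryCode -> enumeration id

# paulingv2.val013 holds the same pairs keyed in val order: enumeration id -> EntryCodes
v2_val013 = defaultdict(list)
for entry_code, enum_id in sorted(val013.items()):
    v2_val013[enum_id].append(entry_code)

def entry_codes_for(enum_id):
    """Property value -> EntryCodes (the paulingv2.valxxx direction)."""
    return v2_val013.get(enum_id, [])

def property_of(entry_code):
    """EntryCode -> property value (the pauling.valxxx direction)."""
    return val013[entry_code]

def label_of(enum_id):
    """Enumeration id -> readable label (the pauling.dblxxx direction)."""
    return dbl013[enum_id]

print(entry_codes_for(1422), label_of(property_of(102)))
```

Keeping the same pairs twice, keyed once by EntryCode and once by value, is what lets each direction of the query use an index.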

Here is the thing: Let’s say that we want the Structure Types that have an Atomic Environment Type (AET) of a rhombic dodecahedron, with a/b ratio 1 and alpha = beta = 90°. The property numbers for these are:

Structure Type : 32
AET : 86
a/b ratio : 44
alpha : 41
beta : 42

Since the last three properties are numeric, we don’t enumerate them and the enumeration corresponding to the rhombic dodecahedron for AET is 6. So, fasten your seat belts, we are about to lift off! :

SELECT id,pauling.dbl032.val FROM pauling.dbl032
  INNER JOIN
  (
   SELECT DISTINCT val FROM val032
   INNER JOIN
   (
   SELECT v AS EntryCode FROM
   (
    (
     SELECT val001.val AS v FROM val001
     INNER JOIN paulingv2.val086
     USING (EntryCode)
     WHERE paulingv2.val086.val=6
    ) AS A
    INNER JOIN
    (
     (
      SELECT val001.val AS v FROM val001
      INNER JOIN paulingv2.val044
      USING (EntryCode)
      WHERE paulingv2.val044.val=1
     ) AS B
     INNER JOIN
     (
      # 41=90 && 42= 90
      (
       SELECT val001.val AS v FROM val001
       INNER JOIN paulingv2.val041
       USING (EntryCode)
       WHERE paulingv2.val041.val=90
      ) AS C
      INNER JOIN
      (
       SELECT val001.val AS v FROM val001
       INNER JOIN paulingv2.val042
       USING (EntryCode)
       WHERE paulingv2.val042.val=90
      ) AS D
      USING (v)
     )
     USING (v)
    )
    USING (v)
   )
   ) AS G
   USING (EntryCode)
  )
  AS Q ON (id = Q.val) ORDER BY val;

Later addition: Assuming 86=6; we don’t really need the other constraints 44=1, 42=90, 41=90 – do we?… So it’s just the boring:

SELECT id FROM pauling.dbl032
  INNER JOIN
  (
   SELECT DISTINCT val FROM val032
   INNER JOIN
   (
   SELECT v AS EntryCode FROM
   (
    (
     SELECT val001.val AS v FROM val001
     INNER JOIN paulingv2.val086
     USING (EntryCode)
     WHERE paulingv2.val086.val=6
    ) AS A
   )
   ) AS G
   USING (EntryCode)
  )
  AS Q ON (id = Q.val) ORDER BY Q.val;

The result of this is something like this:

To be honest, it is actually something like this :

The strange symbols are the price we pay for using non-standard charsets! 😉

So, to tidy up, I import this to a table with the following structure:

CREATE TABLE IF NOT EXISTS `bcc` (
`id` smallint(5) unsigned NOT NULL default '0',
`val` varchar(254) NOT NULL default '',
`usagecount` smallint(5) unsigned NOT NULL default '0',
`val1` varchar(30) NOT NULL default '0',
`val2` varchar(6) default NULL,
`val3` varchar(3) default NULL,
`val1t` varchar(30) NOT NULL,
`val2sp` smallint(5) unsigned NOT NULL,
`SG` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `val` (`val`),
KEY `val1` (`val1`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

You have already met the id and val columns. "usagecount" will be imported from the pauling.dbl032 table; val1, val2 and val3 are the separated structure type information (i.e., for "CuTi,tp4,129", val1="CuTi", val2="tp4" and val3="129"); val1t is the "translated", readable version of val1 (i.e., "(Ag¡•¦§Zn¡•¥¥)©Zn" is translated as "(Ag0.56Zn0.44)8Zn"). The translation is done via the following simple php function:

function translate_symbols($string,$f_tr2symb=true)
{
$symbol_array = array("•","¡","¢","£","¤","¥","¦","§","¨","©","ª");//subscript symbols of the non-standard charset
$transl_array = array(".","0","1","2","3","4","5","6","7","8","9");//their plain-text equivalents

if(!$f_tr2symb)return str_replace($transl_array,$symbol_array,$string);
else return str_replace($symbol_array,$transl_array,$string);
}
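The same character mapping can be sketched in Python with str.translate. The function name and the flag's meaning mirror the PHP version; the to_plain parameter name and the use of str.maketrans are my own choices.

```python
SYMBOLS = ["•", "¡", "¢", "£", "¤", "¥", "¦", "§", "¨", "©", "ª"]   # subscript symbols
PLAIN = [".", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]     # plain-text equivalents

TO_PLAIN = str.maketrans(dict(zip(SYMBOLS, PLAIN)))
TO_SYMBOL = str.maketrans(dict(zip(PLAIN, SYMBOLS)))

def translate_symbols(text, to_plain=True):
    """Map subscript symbols to plain characters, or back again when to_plain is False."""
    return text.translate(TO_PLAIN if to_plain else TO_SYMBOL)

print(translate_symbols("(Ag¡•¦§Zn¡•¥¥)©Zn"))
```

Unlike str_replace, which applies the replacements one after another, translate maps every character in a single pass. The two give the same result here because no symbol is also a plain character.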

"val2sp" is the sliced numeric value from the Pearson Symbol stored in the val2 column.

To tidy up:

$query = "SELECT id, val1, val2 FROM bcc";
$qresult = mysql_query($query);

while($result = mysql_fetch_array($qresult))
{
$query2 = "UPDATE bcc SET val1t=\"".translate_symbols($result["val1"])."\", val2sp=SUBSTRING(val2,3,20) WHERE id = ".$result["id"]." LIMIT 1";
//echo $query2."\n";
$q2result = mysql_query($query2);
echo mysql_error();
}
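The slicing of the structure type string and of the Pearson symbol can be illustrated outside the database as well. The Python sketch below uses hypothetical helper names and assumes, as the SUBSTRING(val2,3,20) call does, that the numeric part of val2 starts at its third character.

```python
def split_structure_type(val):
    """Split 'prototype,Pearson symbol,space group number' into (val1, val2, val3)."""
    val1, val2, val3 = val.split(",")
    return val1, val2, val3

def pearson_number(val2):
    """Mirror SUBSTRING(val2, 3, 20): skip the first two characters and read the rest as a number."""
    return int(val2[2:22])

val1, val2, val3 = split_structure_type("CuTi,tp4,129")
print(val1, val2, val3, pearson_number(val2))
```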

You can refer to my previous entry for slicing up the "val" column. I had already done this while constructing the pauling.dblxxx tables, so in fact the actual view of the pauling.dbl032 table is the following one

meaning, I can just import them using the id’s of my new table:

UPDATE bcc, pauling.dbl032 SET
bcc.val = pauling.dbl032.val,
bcc.usagecount = pauling.dbl032.usagecount,
bcc.val1 = pauling.dbl032.val1,
bcc.val2 = pauling.dbl032.val2,
bcc.val3 = pauling.dbl032.val3
WHERE bcc.id = pauling.dbl032.id

Now we have something like:

We still have some work to do. Let’s take the two structures Ni₂Al and Ni₂In. Say that we are looking for superstructures, which have the property that the "atoms occupy atomic positions according to the parent crystal structure". The AET for Ni₂Al is given as 14-b;14-b;14-b; whereas the AET for Ni₂In is given as 11-a;11-a;14-b; . We want every atom to have the AET of the parent crystal structure (14-b, the rhombic dodecahedron for bcc), so we will eliminate the structures that contain any other AET.
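The selection rule itself fits in a few lines of Python. This is a simplified sketch that works on the AET string of a single structure, whereas the database query works on the enumerated AET values of all entries; the function name is hypothetical.

```python
def only_parent_aet(aet_field, parent="14-b"):
    """Return 1 if every AET listed in the ';'-separated field equals the parent AET, else 0."""
    labels = [a for a in aet_field.split(";") if a]   # drop the empty piece after the last ';'
    return int(bool(labels) and all(a == parent for a in labels))

print(only_parent_aet("14-b;14-b;14-b;"), only_parent_aet("11-a;11-a;14-b;"))
```

The first structure is kept and the second one is rejected, in line with the Ni₂Al and Ni₂In example.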

mysql_query("USE pauling");
$ids_q = mysql_query("SELECT id FROM s07pt.bcc");
$j = 0; // running counter for the progress output
while($ids = mysql_fetch_row($ids_q))
{
$id= $ids[0];
$query = "
SELECT COUNT(DISTINCT val) FROM pauling.val032
INNER JOIN
(
         SELECT v AS EntryCode FROM
          (
                 SELECT pauling.val001.val AS v FROM pauling.val001
                 INNER JOIN paulingv2.val032
                 USING (EntryCode)
                 WHERE paulingv2.val032.val = ".$id."
          ) AS A
         INNER JOIN
          (
                 SELECT pauling.val001.val AS v FROM pauling.val001
                 INNER JOIN paulingv2.val086
                 USING (EntryCode)
                 WHERE paulingv2.val086.val != 6
          ) AS B
USING (v)
) as C USING (EntryCode)";

$count_q = mysql_query($query);
// COUNT(DISTINCT val) is 0 when no entry of this structure type has an AET other than 6 (14-b)
$count = mysql_result($count_q,0);
$result = ($count == 0) ? 1 : 0;
$query = "UPDATE s07pt.bcc SET incl_theo = $result where id = $id LIMIT 1";
$j++;
echo $j.".\t".$query."\n";
mysql_query($query);
}

"incl_theo" is the column which is equal to 1 if the structure in question contains no AET other than 14-b, and 0 otherwise. This gives us something like:

Filed in Coding, MySQL, php | No Comments »

Boasting? I guess so… 8)

December 21, 2007 Posted by Emre S. Tasci

Suppose that you’ve collected some data from the output of a program. Let’s say that some part of this data consists of author names, something similar to:

You want to split the initials from the surnames. This is a piece of cake with PHP, but I don’t want to go parsing each row, of which there are many… So, take a look at this ugly beauty:

UPDATE dbl004 set val1 = IF(LOCATE(".",val),TRIM(SUBSTRING(SUBSTRING_INDEX(val,".",1),1, LENGTH(SUBSTRING_INDEX(val,".",1)) - LENGTH(SUBSTRING_INDEX(SUBSTRING_INDEX(val,".",1)," ",-1)))),val), val2 = IF(LOCATE(".",val), TRIM(SUBSTRING(val,LENGTH(SUBSTRING_INDEX(val,".",1)) - LENGTH(SUBSTRING_INDEX(SUBSTRING_INDEX(val,".",1)," ",-1)))),"");

aaaaand here is what you get:

If you are thinking of something simpler, such as

UPDATE dbl004 SET val1 = LEFT(val,LOCATE(" ",val)-1), val2 = RIGHT(val,LENGTH(val)-LOCATE(" ",val));

or the same query without the IF guards,

UPDATE dbl004 set val1 = TRIM(SUBSTRING(SUBSTRING_INDEX(val,".",1),1,LENGTH(SUBSTRING_INDEX(val,".",1)) - LENGTH(SUBSTRING_INDEX(SUBSTRING_INDEX(val,".",1)," ",-1)))), val2 = TRIM(SUBSTRING(val, LENGTH(SUBSTRING_INDEX(val,".",1)) - LENGTH(SUBSTRING_INDEX(SUBSTRING_INDEX(val,".",1)," ",-1))));

try to process these 3 values: "van der Graaf K.L. Jr.", "Not Available" and "Editor".
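To see what the guarded query does with these three values, here is its logic unrolled in Python. This is a sketch with a hypothetical function name: everything before the first period is inspected, its last word is taken as the first initial, and the value is cut just before that word. Values without a period are left unsplit.

```python
def split_author(val):
    """Return (surname, initials); values without a '.' stay whole, with empty initials."""
    if "." not in val:
        return val, ""
    head = val.split(".", 1)[0]            # SUBSTRING_INDEX(val, ".", 1)
    last_word = head.rsplit(" ", 1)[-1]    # SUBSTRING_INDEX(head, " ", -1)
    cut = len(head) - len(last_word)       # number of characters that belong to the surname part
    return val[:cut].strip(), val[cut:].strip()

for name in ["van der Graaf K.L. Jr.", "Not Available", "Editor"]:
    print(split_author(name))
```

Without the period check, "Not Available" would be split into "Not" and "Available", which is exactly what the IF guards prevent.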

About this entry: I couldn’t refrain from boasting after I managed to come up with that beautiful MySQL query… sorry for that. (Yes, I know, superbia, the 7th and the most deadly…) So let me try to balance this arrogant entry of mine:

With my best regards,
Your humble blogger…

Filed in Coding, MySQL | 4 Comments »

SAGE: Open Source Mathematics Software

December 9, 2007 Posted by Emre S. Tasci

I don’t know how good it is yet, but it offers quite a lot and does so in the spirit of free software, so I ought to give it a try. I will let you know when I have covered some of the basics…

Filed in Coding | No Comments »

Some “trivial” derivations

December 4, 2007 Posted by Emre S. Tasci

Information Theory, Inference, and Learning Algorithms by David MacKay, Exercise 22.5:

A random variable x is assumed to have a probability distribution that is a mixture of two Gaussians,

\[ P\left( x \mid \{\mu_k\}, \sigma \right) = \left[ \sum_{k=1}^{2} p_k \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-\mu_k)^2}{2\sigma^2} \right) \right], \]

where the two Gaussians are given the labels k = 1 and k = 2; the prior probability of the class label k is {p₁ = 1/2, p₂ = 1/2}; $\{\mu_k\}$ are the means of the two Gaussians; and both have standard deviation $\sigma$. For brevity, we denote these parameters by

\[ \boldsymbol{\theta} \equiv \left\{ \{\mu_k\}, \sigma \right\}. \]

A data set consists of N points $\{x_n\}_{n=1}^N$ which are assumed to be independent samples from the distribution. Let k_n denote the unknown class label of the nth point.

Assuming that $\{\mu_k\}$ and $\sigma$ are known, show that the posterior probability of the class label k_n of the nth point can be written as

and give expressions for $\omega_1$ and $\omega_0$.

Derivation:

\[
\begin{aligned}
P\left( k_n = 1 \mid x_n, \boldsymbol{\theta} \right)
 &= \frac{\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_n-\mu_1)^2}{2\sigma^2} \right) P(k_n = 1)}
         {\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_n-\mu_1)^2}{2\sigma^2} \right) P(k_n = 1) + \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_n-\mu_2)^2}{2\sigma^2} \right) P(k_n = 2)} \\
 &= \frac{1}{1 + \exp\left( -\frac{(x_n-\mu_2)^2}{2\sigma^2} + \frac{(x_n-\mu_1)^2}{2\sigma^2} \right) \left( \frac{1 - P(k_n = 1)}{P(k_n = 1)} \right)}
\end{aligned}
\]
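The last line can be pushed one step further to read off the two coefficients. The sketch below assumes that the intended form is the logistic function $1/(1+\exp[-(\omega_1 x_n + \omega_0)])$ and uses the equal priors p₁ = p₂ = 1/2 of the exercise, so that the prior ratio equals 1:

```latex
\begin{aligned}
\frac{(x_n-\mu_1)^2 - (x_n-\mu_2)^2}{2\sigma^2}
  &= \frac{-2x_n(\mu_1-\mu_2) + \mu_1^2 - \mu_2^2}{2\sigma^2}
   = -\left( \frac{\mu_1-\mu_2}{\sigma^2}\, x_n + \frac{\mu_2^2-\mu_1^2}{2\sigma^2} \right), \\
P\left( k_n = 1 \mid x_n, \boldsymbol{\theta} \right)
  &= \frac{1}{1 + \exp\left[ -(\omega_1 x_n + \omega_0) \right]},
  \qquad \omega_1 = \frac{\mu_1-\mu_2}{\sigma^2},
  \quad \omega_0 = \frac{\mu_2^2-\mu_1^2}{2\sigma^2}.
\end{aligned}
```

As a sanity check, for $\mu_1 > \mu_2$ a large $x_n$ makes $\omega_1 x_n + \omega_0$ large and positive, so the posterior for class 1 approaches 1, as it should.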

Assume now that the means $\{\mu_k\}$ are not known, and that we wish to infer them from the data $\{x_n\}_{n=1}^N$. (The standard deviation $\sigma$ is known.) In the remainder of this question we will derive an iterative algorithm for finding values for $\{\mu_k\}$ that maximize the likelihood,

Let L denote the natural log of the likelihood. Show that the derivative of the log likelihood with respect to $\mu_k$ is given by

where $p_{k|n} \equiv P\left( k_n = k \mid x_n, \boldsymbol{\theta} \right)$ appeared above.

Derivation:
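A sketch of the steps, assuming the mixture density and the independence of the samples stated in the exercise. The log likelihood is a sum over the data points, and differentiating the logarithm of the mixture brings the posterior $p_{k|n}$ out as a weight:

```latex
\begin{aligned}
L &= \sum_{n=1}^{N} \ln \sum_{k=1}^{2} p_k \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_n-\mu_k)^2}{2\sigma^2} \right), \\
\frac{\partial L}{\partial \mu_k}
  &= \sum_{n=1}^{N} \frac{p_k \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_n-\mu_k)^2}{2\sigma^2} \right)}
                         {\sum_{k'=1}^{2} p_{k'} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_n-\mu_{k'})^2}{2\sigma^2} \right)}
     \cdot \frac{x_n-\mu_k}{\sigma^2}
   = \sum_{n=1}^{N} p_{k|n} \frac{x_n-\mu_k}{\sigma^2}.
\end{aligned}
```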

Solution:

We will be trying to plot the function

if we designate the function

\[ \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \]

as p[x,mu] (remember that $\sigma = 1$ and $\frac{1}{\sqrt{2\pi}} = 0.3989422804014327$),

\[
\begin{aligned}
P\left( x \mid \{\mu\}, \sigma \right) &= \left[ \sum_{k=1}^{2} \left( p_k = .5 \right) \frac{1}{\sqrt{2\pi\left( \sigma^2 = 1^2 \right)}} \exp\left( -\frac{(x-\mu_k)^2}{2\sigma^2} \right) \right] \\
 &= .5 \left( \mathbf{p}[x,\mathrm{mu1}] + \mathbf{p}[x,\mathrm{mu2}] \right) \equiv \mathbf{pp}[x,\mathrm{mu1},\mathrm{mu2}]
\end{aligned}
\]

then we have:

\[
\begin{aligned}
P\left( \{x_n\}_{n=1}^N \mid \{\mu_k\}, \sigma \right) &= \prod_n P\left( x_n \mid \{\mu_k\}, \sigma \right) \\
 &= \prod_n \mathbf{pp}[x_n,\mathrm{mu1},\mathrm{mu2}] \equiv \mathbf{ppp}[x,\mathrm{mu1},\mathrm{mu2}]
\end{aligned}
\]

And in Mathematica, these mean:

mx=Join[N[Range[0,2,2/15]],N[Range[4,6,2/15]]]
Length[mx]
ListPlot[Table[{mx[[i]],1},{i,1,32}]]

p[x_,mu_]:=0.3989422804014327` * Exp[-(mu-x)^2/2];
pp[x_,mu1_,mu2_]:=.5 (p[x,mu1]+p[x,mu2]);
ppp[xx_,mu1_,mu2_]:=Module[
{ptot=1,ppar,i},
For[i=1,i<=Length[xx],i++,
ppar = pp[xx[[i]],mu1,mu2];
ptot *= ppar;
(*Print[xx[[i]],"\t",ppar];*)
];
Return[ptot];
];

Plot3D[ppp[mx,mu1,mu2],{mu1,0,6},{mu2,0,6},PlotRange->{0,10^-25}];

ContourPlot[ppp[mx,mu1,mu2],{mu1,0,6},{mu2,0,6},{PlotRange->{0,10^-25},ContourLines->False,PlotPoints->250}];(*It may take a while with PlotPoints->250, so just begin with PlotPoints->25 *)
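For those without Mathematica, the likelihood itself (not the plots) can be reproduced in plain Python. This is a sketch that keeps the names p, pp, ppp and mx from the notebook and assumes σ = 1, as above.

```python
from math import exp, pi, sqrt

def p(x, mu):
    """Unit-variance Gaussian density centered at mu."""
    return exp(-(x - mu) ** 2 / 2) / sqrt(2 * pi)

def pp(x, mu1, mu2):
    """Mixture density of one point with equal priors of 0.5."""
    return 0.5 * (p(x, mu1) + p(x, mu2))

def ppp(xs, mu1, mu2):
    """Likelihood of the whole data set: the product of the mixture densities."""
    total = 1.0
    for x in xs:
        total *= pp(x, mu1, mu2)
    return total

# Same data as mx: 16 equally spaced points on [0, 2] and 16 on [4, 6]
step = 2 / 15
mx = [i * step for i in range(16)] + [4 + i * step for i in range(16)]

print(len(mx), ppp(mx, 1, 5), ppp(mx, 3, 3))
```

The likelihood is symmetric under swapping mu1 and mu2, so the surface has two equivalent peaks, one near (1, 5) and one near (5, 1). Both are far higher than the value at the midpoint (3, 3).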

That’s all folks! (for today I guess 8) (and also, I know that I said next entry would be about the soft K-means two entries ago, but believe me, we’re coming to that, eventually 😉

Attachments: Mathematica notebook for this entry, MSWord Document (actually this one is intended for me, because in the future I may need them again)

Filed in Coding, Definition & Method, Mathematica | No Comments »
