Midi at Work

Thursday, January 3, 2019

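Setting up crontab

1. Edit the crontab file: sudo vim /etc/crontab
2. After editing, restart the cron service: sudo service cron restart
3. Redirect the output of a cron job to a log file (for debugging):
13 15 * * * root cd /home/yourpath; sh runme.sh >> logme.log 2>&1
This creates logme.log in /home/yourpath/, where you can inspect the output of the cron run. Useful for debugging.
4. Set environment variables in crontab:
cron does not automatically use the user's own environment variables; they have to be set explicitly.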

Thursday, March 6, 2008

Highlighting code with Listings

http://en.wikibooks.org/wiki/LaTeX/Packages/Listings
pdf

Example:

\documentclass{article}
\usepackage{listings}

\begin{document}

\begin{lstlisting}[language=C]
int main(int argc, char ** argv)
{
/* print a string "Hello world!" */
printf("Hello world!\n");

return 0;
}
\end{lstlisting}

\end{document}

Result:

Friday, February 8, 2008

Comments on Rethinking the Semantic Web

Yesterday I read the article Rethinking the Semantic Web (part1, part2) by Rob McCool.
It points out several fundamental defects underlying the Semantic Web idea and proposes a Named Entity Web as the solution. Although it was published in 2005, the issues it discusses are now becoming widely recognized and are attracting great interest.

Briefly, from the point of view of human understandability and machine processability, I list the features of the Semantic Web as follows.

1. All information is in (RDF) triples.
2. The meaning of triples is interpreted by the ontologies that they cite.
3. To allow machines to process triples more cleverly, OWL, based on frame logic, is introduced.

The latter two points relate closely to knowledge representation (KR). In his article, Rob McCool discusses the defects that the Semantic Web inherits from its KR origins. I quote the statements from the paper below:

KR uses the fundamental mathematics of Codd's theory to translate information, which humans represent with natural language, into sets of tables that use well-defined schema to define what can be entered in the rows and columns.
--[comment] the origin
Because information theory removes nearly all context from information, KR represents only fact.
--[comment] Whereas web users are mostly interested in context-related information.
Complex relationships, exceptions to rules and ideas that resist simplistic classifications pose significant design challenges to information bases.

Thus, they pose a fundamental barrier, in terms of richness of representation as well as creation and maintenance, compared to the written language that people use.

I totally agree with the author in that “New representations must be easy to translate to and from natural language” and that "any other approach ignores the representation problem, assumes that context-free facts and logical rules are sufficient, and will fail."

In part2 of the paper, the author proposes a Named Entity Web (NEW). It removes classes, relations and triples from Semantic Web formats in order to provide a less ambitious but more feasible version of the Semantic Web. In the NEW proposal:
1. The basic element is an entity, which can be thought of as a simple business-card-style record.
2. Entities do not need the consistency and formalism that ontologies work so hard to ensure.
3. Entities can be created by, for example, users or manufacturers, for themselves.
4. Entities are embedded in HTML files and are thus connected to their context.
5. The semantics of an entity can be clarified when necessary.
6. Problems related to consistency, semantics or trust can be solved by current techniques such as PageRank, search engines and so on.

I agree with points 2, 3, 5 and 6. But I still doubt whether an entity is a better representation frame than triples (besides, the paper doesn't give enough details). As for point 4, I don't think embedding entities in HTML files is the only way to connect machine-readable information with its context.

What I am trying to do is:
http://www.miv.t.u-tokyo.ac.jp/papers/yangj-WI07.pdf
http://midi.jie.yang.googlepages.com/tita

Sunday, January 27, 2008

Placing a figure and a table side by side with minipage

\begin{minipage}[htb]{.45\textwidth}
\begin{minipage}{.4\textwidth}
\centering
\includegraphics[width=100pt, bb=0 0 94.19mm 94.19mm]{fig.eps}
\makeatletter\def\@captype{figure}\makeatother\caption{\scriptsize An infobox of the Wikipedia page.}
\label{fig:myfig}
\end{minipage}
\quad
\begin{minipage}{.45\textwidth}
\centering
\tiny
\begin{tabular}{|cc|}
\hline
\multicolumn{2}{|c|}{Classes from Wikipedia structure}\\
\hline
% & Feature & Example \\
genre 70 & members 57\\
origin 48 &member role 48 \\
years active 44 & a music group 39\\
member occupations 26& album Name 18 \\
lyrics 13 &album no. 10 \\
single 10 & born in 8\\
\hline
\multicolumn{2}{|c|}{extra classes from corpus}\\
\hline
die in 7 & former Members 4 \\
alias 3 &instruments 3\\
awards 3 & label 2 \\
associate act 1 & birth name 1 \\
\hline
\end{tabular}
\makeatletter\def\@captype{table}\makeatother\caption{Relation classes.}
\label{tbl:mytbl}
\end{minipage}
\end{minipage}

Wednesday, January 16, 2008

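One figure spanning two columns, and figure arrays

\begin{figure*}[tb] makes the figure span both columns.
Use subfigure to arrange the subfigures. This needs the subfigure package, plus graphicx for \includegraphics:
\usepackage{graphicx}
\usepackage{subfigure}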

---------------------------------------------------------------
\begin{figure*}[tb]
\centering
\subfigure[1]{\includegraphics[width=400pt]{D:/0work/res/pic/1.eps}}
%\mbox{\hspace{0.5cm}}
\\
\subfigure[2]{\includegraphics[width=140pt]{D:/0work/res/pic/2.eps}}
\subfigure[3]{\includegraphics[width=140pt]{D:/0work/res/pic/3.eps}}
%\mbox{\hspace{0.5cm}}
\subfigure[4]{\includegraphics[width=140pt]{D:/0work/res/pic/4.eps}}
\renewcommand{\figurename}{figure}
\caption{hello} \label{hello}
\end{figure*}

Monday, January 14, 2008

Typesetting single-line and multi-line equations in LaTeX

from : http://www.boyeut.com/2007/05/equation.html

1. The automatically numbered single-line equation environment is
\begin{equation}
…
\end{equation}

The single-line equation environment without automatic numbering:
\[
…
\]

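A manually numbered single-line equation can use TeX's original display-math markup:
$$equation \eqno number $$ puts the number on the right
$$equation \leqno number $$ puts the number on the left

To refer to the equation, simply write $number$.

For example, $$a^2+b^2=c^2 \eqno (**)$$
The conclusion then follows from equation ($**$).

In general, a display formula $$…$$ can also be written as \[…\],
but for manually numbered equations like this one, \[..\] cannot replace $$…$$.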

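2. When a single equation is very long and needs line breaks, but only one number should be generated, use the split environment (from amsmath)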

\begin{equation}
\begin{split}
a &= b \\
c &= d
\end{split}
\end{equation}

Note: only one "&" is allowed per line; with split, the equation number is centered vertically.

3. Multi-line equations:

\begin{eqnarray}
left & middle & right\\
left & middle & right\\
…
\end{eqnarray}
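This environment numbers every line automatically; to leave a line unnumbered, add \nonumber before its line break.

To change the automatic numbering, reset the counter:
\setcounter{equation}{number}
The next equation is then numbered number+1.

4. Typesetting systems of equations:
Several equations, each numbered automatically.

1) The gather environment
A special case of the align environment below: each line is centered, so no "&" alignment mark is used.
\begin{gather}
a = b \\
c = d \\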
…
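\end{gather}
>>1. If several lines should share one number, nest a split environment inside.
>>2. The \notag command leaves the current line unnumbered.
2) The align environment
It places several groups of equations side by side, i.e. several equations on the same line, aligned with "&" as before.
It can replace the gather environment.
3) The equation-group environments above always occupy a full line, however small each equation is. The corresponding gathered and aligned environments take up only the actual width of the equations, and the whole block is treated as one very large symbol alongside other symbols.
These environments also accept a position argument (b, t) that sets the vertical alignment relative to the other symbols. They no longer number equations automatically.

For example: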
\begin{equation}
\left.
\begin{aligned}[b]
a &= b+c \\
d &= b+c
\end{aligned}
\right\}
\Longrightarrow
\qquad a=d
\end{equation}

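This corrects a small error on p. 149 of the reference: \right} should be \right\}.

[References]
1. Chen Zhijie et al., LATEX入门与提高 (2nd edition), Higher Education Press, May 2006

See also:
LaTeX: referencing theorem numbers

LaTeX: the cases environment

LaTeX: small details

LaTeX: typesetting theorems and definitions
etc.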

Saturday, December 29, 2007

feature selection

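I wanted to look into random forests and found this book:
Feature Extraction, Foundations and Applications.
I got hold of the first chapter; it reads very well, so I added the book to my Douban list.

The related course website is here.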

Tuesday, December 4, 2007

SURVEY

See here

Friday, November 16, 2007

Notes about SVM & the Kernel Method

Reference:
[1] An Introduction to Support Vector Machines and other kernel-based learning methods
Nello Cristianini, John Shawe-Taylor
[2] 数据挖掘中的新方法-支持向量机 (New Methods in Data Mining: Support Vector Machines)
Deng Naiyang, Tian Yingjie
[3] 机器学习 (Machine Learning)
Tom M. Mitchell
Chinese translation by Zeng Huajun, Zhang Yinkui, et al.
[4] http://en.wikipedia.org/wiki/Support_vector_machine
[5] An introduction to Kernel-Based Learning Algorithms
Klaus-Robert Müller, Sebastian Mika, Gunnar Rätsch, Koji Tsuda, and Bernhard Schölkopf
----------------------------------------------------------------------------------
Issues:
1.Some background of computational learning theory
Empirical Risk Minimization (ERM)
Probably Approximately Correct Learning (PAC)
--risk function and expected risk ([2] p131)
Vapnik-Chervonenkis dimension
refer:
[1] Chapter4 Generalization Theory
[2] p161 section4.8
[3] chapter seven

*They tend to solve the problem:

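*To solve a classification problem, one might think of the following route: first estimate the probability distribution from the training set, then derive the decision function from the estimated distribution. But this violates a basic principle for solving problems with a limited amount of information: when solving a problem, one should not solve a more general problem as an intermediate step. The problem at hand is classification, whereas estimating a probability distribution is a more general problem. In fact, in statistics, knowing the probability distribution amounts to knowing almost everything, because it can be used to solve all kinds of problems. Therefore we should not solve the classification problem by estimating the probability distribution. [2]p141

*A learning algorithm needs to choose an appropriate hypothesis set F. The key factor here is the size of F, or its richness, or its "expressive power". The VC dimension is one way of describing this expressive power.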
Roughly speaking, the VC dimension measures how many (training) points can be shattered (i.e., separated) for all possible labelings using functions of the class. ([5] section II)
*
2.Optimization Theory
[1] chapter 5
[2] chapter 1
The problems addressed in 1 have a similar form: the hypothesis function should be chosen to minimize (or maximize) a certain functional. Optimization theory is the branch of mathematics concerned with characterizing the solutions of classes of such problems, and developing effective algorithms for finding them.

3. kernel
*Frequently the target concept cannot be expressed as a simple linear combination of the given attributes, but in general requires that more abstract features of the data be exploited. Kernel representations offer an alternative solution by projecting the data into a high dimensional feature space to increase the computational power of the linear learning machines. [1]p26

[comment:] "the advantage of using the machines in the dual representation derives from the fact that in this representation the number of tunable parameters does not depend on the number of attributes being used."
-- The number of parameters depends on the definition of the kernel. If you define a "bad" kernel, there will be many parameters as well.

**
With kernels we can compute the linear hyperplane classifier in a Hilbert space with higher dimension. We prefer linear models because:

1) there is the intuition that a "simple" (e.g., linear) function that explains most of the data is preferable to a complex one (Occam's razor). ([5] section II)

2) In practice the bound on the general expected error computed by VC dimension is often neither easily computable nor very helpful. Typical problems are that the upper bound on the expected test error might be trivial (i.e., larger than one), the VC dimension of the function class is unknown or it is infinite (in which case one would need an infinite amount of training data). ([5], section II )

For linear methods, the VC dimension can be estimated as the number of free parameters, but for most other methods (including k-nearest neighbors), accurate estimates for VC-dimension are not available.

3) For the class of hyperplanes, the VC dimension itself can be bounded in terms of another quantity, the margin (the margin is defined as the minimal distance of a sample to the decision surface). ([5] section II)

4. Existing Kernels

Polynomial (homogeneous): $k(\mathbf{x},\mathbf{x}')=(\mathbf{x} \cdot \mathbf{x'})^d$
Polynomial (inhomogeneous): $k(\mathbf{x},\mathbf{x}')=(\mathbf{x} \cdot \mathbf{x'} + 1)^d$
Radial Basis Function: $k(\mathbf{x},\mathbf{x}')=\exp(-\gamma \|\mathbf{x} - \mathbf{x'}\|^2)$ , for $\gamma > 0$
Gaussian Radial basis function: $k(\mathbf{x},\mathbf{x}')=\exp\left(- \frac{\|\mathbf{x} - \mathbf{x'}\|^2}{2 \sigma^2}\right)$
Sigmoid: $k(\mathbf{x},\mathbf{x}')=\tanh(\kappa \mathbf{x} \cdot \mathbf{x'}+c)$ , for some (not every) $\kappa > 0$ and $c < 0$

([4])

In short, not the dimensionality but the complexity of the function class matters. In a suitable feature space, all one needs for separation is a linear hyperplane. However, it becomes rather tricky to control the feature space for large real-world problems. So even if one could control the statistical complexity of this function class, one would still run into intractability problems while executing an algorithm in this space. Fortunately, for certain feature spaces F and corresponding mappings $\Phi$, there is a highly effective trick for computing scalar products in feature spaces using kernel functions. ([5] section II)
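The kernel trick mentioned above can be checked numerically in a small case. The sketch below is only an illustration under my own assumptions: the helper names (`dot`, `poly_kernel`, `rbf_kernel`, `phi2`) and the sample points are made up, and the explicit feature map shown is the standard one for the homogeneous polynomial kernel with d = 2 in two dimensions.

```python
import math


def dot(x, y):
    """Ordinary scalar product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def poly_kernel(x, y, d=2):
    """Homogeneous polynomial kernel (x . y)^d."""
    return dot(x, y) ** d


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def phi2(x):
    """Explicit feature map for d = 2 in 2-D: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


# Hypothetical sample points, chosen only for illustration.
x, y = (1.0, 2.0), (3.0, -1.0)

# The kernel works in the input space ...
k_implicit = poly_kernel(x, y)
# ... and agrees with the scalar product taken in the feature space.
k_explicit = dot(phi2(x), phi2(y))

print(k_implicit, k_explicit)
```

The kernel never builds the feature vector, which is what keeps the computation tractable when the feature space is very large (or infinite-dimensional, as for the RBF kernel).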

Wednesday, October 24, 2007

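LaTeX formulas

from http://blog.sina.com.cn/s/blog_53be544e010003hn.html
First, load the ntheorem package

\usepackage[amsmath,thmmarks]{ntheorem}
% Package for theorem-like environments; the amsmath option
% provides compatibility with the AMS LaTeX packages

%=== Together with the ntheorem package above, this defines the various theorem structures and redefines some
%headings used in the main text ===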
\theoremstyle{plain}
\theoremheaderfont{\normalfont\rmfamily\CJKfamily{hei}}
\theorembodyfont{\normalfont\rm\CJKfamily{kai}} \theoremindent0em
\theoremseparator{\hspace{1em}} \theoremnumbering{arabic}
%\theoremsymbol{} %symbol added automatically at the end of a theorem
\newtheorem{definition}{\hspace{2em}定义}[section]
%\newtheorem{definition}{\hei 定义}[section]
%!!! Note: when section numbers are Chinese numerals, [section] cannot be used!
\newtheorem{proposition}{\hspace{2em}命题}[section]
\newtheorem{property}{\hspace{2em}性质}[section]
\newtheorem{lemma}{\hspace{2em}引理}[section]
%\newtheorem{lemma}[definition]{引理}
\newtheorem{theorem}{\hspace{2em}定理}[section]
\newtheorem{axiom}{\hspace{2em}公理}[section]
\newtheorem{corollary}{\hspace{2em}推论}[section]
\newtheorem{exercise}{\hspace{2em}习题}[section]
\theoremsymbol{$\blacksquare$}
\newtheorem{example}{\hspace{2em}例}[section]
\theoremstyle{nonumberplain}
\theoremheaderfont{\CJKfamily{hei}\rmfamily}
\theorembodyfont{\normalfont \rm \CJKfamily{song}} \theoremindent0em
\theoremseparator{\hspace{1em}} \theoremsymbol{$\blacksquare$}
\newtheorem{proof}{\hspace{2em}证明}

Note: if you use the book class instead of article, change every section to chapter.

Friday, September 7, 2007

Jensen-Shannon divergence

definition given by wikipedia

In probability theory and statistics, the Jensen-Shannon divergence is a popular method of measuring the similarity between two probability distributions.

used as a similarity measure of entities in
Chen, J. and D. Ji, et al. (2006). Relation Extraction Using Label Propagation Based Semi-Supervised Learning. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia.
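As a reminder of what is being computed, here is a minimal sketch of the Jensen-Shannon divergence between two discrete distributions, using the usual definition as the average KL divergence to the midpoint distribution. The function names and the base-2 logarithm are my own choices; with base 2 the value lies between 0 (identical distributions) and 1 (disjoint support), so a smaller value means more similar.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL divergence to the midpoint m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical distributions, for illustration only.
p = [0.5, 0.3, 0.2]
q = [0.1, 0.4, 0.5]

print(js_divergence(p, q))
print(js_divergence(p, p))            # identical distributions
print(js_divergence([1, 0], [0, 1]))  # disjoint support
```

Unlike the plain KL divergence, this quantity is symmetric and always finite, because the midpoint distribution is non-zero wherever either input is.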

Sunday, August 26, 2007

Lazy Learning

http://en.wikipedia.org/wiki/Lazy_learning

In artificial intelligence, lazy learning is a learning method in which generalization beyond the training data is delayed until a query is made to the system, as opposed to in eager learning, where the system tries to generalize the training data before receiving queries.
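A nearest-neighbour classifier is the textbook example of a lazy learner. The sketch below is only an illustration (the class name, method names and toy data are assumptions of mine): "training" just stores the examples, and all the work happens when a query arrives.

```python
class NearestNeighbor:
    """1-nearest-neighbour classifier: a minimal lazy learner."""

    def __init__(self):
        self.examples = []

    def fit(self, points, labels):
        # No generalization happens here: the training data is only stored.
        self.examples = list(zip(points, labels))

    def predict(self, query):
        # Generalization is deferred to query time: find the closest stored point.
        def squared_distance(point):
            return sum((a - b) ** 2 for a, b in zip(point, query))

        _, label = min(self.examples, key=lambda example: squared_distance(example[0]))
        return label


# Hypothetical toy data.
model = NearestNeighbor()
model.fit([(0, 0), (1, 1), (5, 5)], ["a", "a", "b"])

print(model.predict((4, 4)))
print(model.predict((0.2, 0.1)))
```

An eager learner would instead fit parameters (for example a separating hyperplane) inside `fit` and could then discard the training data.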

...

Hausdorff distance

what is Hausdorff distance ?
An introduction

Named after Felix Hausdorff (1868-1942), Hausdorff distance is the « maximum distance of a set to the nearest point in the other set »
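For finite point sets the quoted definition can be written down directly. This is a small sketch with names of my own choosing; it uses Euclidean distance via `math.dist` (Python 3.8+) and the common symmetric form, i.e. the larger of the two directed distances.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical point sets on a line, for illustration.
set_a = [(0, 0), (1, 0)]
set_b = [(0, 0), (4, 0)]

print(directed_hausdorff(set_a, set_b))
print(directed_hausdorff(set_b, set_a))
print(hausdorff(set_a, set_b))
```

The two directed distances differ here, which is why the symmetric version takes their maximum.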

Sunday, August 19, 2007

A talk given by Prof. Mitch Marcus last Friday

> Title: Unsupervised induction of morphological structure
>
> Abstract:
> We will discuss the problem of unsupervised morphological and part of
> speech (POS) acquisition in realistic settings. From studies of tagged
> corpora, we show that there is a sparse data problem in morphology,
> which raises the question of how rare forms may be learned. We then show
> that it is often the case that the base form of a word is present among
> the different inflections of a lexeme, which suggests that rare forms
> can be learned by association with a base form. We introduce new
> representations for morphological structure which express the
> morphophonological transduction behavior of these base forms, and
> present
> an algorithm to acquire these structures automatically from an unlabeled
> corpus. We apply the algorithm to a range of Indo-European languages
> including Slovene, English, and Spanish.
>

comment:
1. met the same group of people (well, I mean young researchers basically) again.
2. I asked two questions on how to deal with sparse data. Based on what I understood:
a) To prune the space by analyzing features of the data.
b) To add background knowledge.
3. Jin said he is the 牛魔王 (the "Bull Demon King", i.e. a towering figure) in their field! Orz...

Friday, August 17, 2007

Stacking two subfigures vertically in LaTeX

Load the subfigure package.
\usepackage{subfigure}

Usage:
\begin{figure}[tb]
\centering \subfigure[subCaption_1 ]{\includegraphics[width=200pt]{1.eps}}
\label{fig:selFilter} \subfigure[subCaption_2]{\includegraphics[width=200pt]{2.eps}}
\label{fig:tripleQuery}
\caption{Query} \label{Fig:CellDropRates}
\end{figure}

Tuesday, August 7, 2007

Maximum Likelihood Estimate (MLE)

1. from wikipedia:

Maximum likelihood estimation (MLE) is a popular statistical method used to make inferences about parameters of the underlying probability distribution from a given data set. That is to say, you have a sample of data

$X_{1}, \dots, X_{n}$

and some kind of model for data, and you want to estimate parameters of the distribution.

2. from http://www.itl.nist.gov/div898/handbook/eda/section3/eda3652.htm
Maximum likelihood estimation begins with the mathematical expression known as a likelihood function of the sample data. Loosely speaking, the likelihood of a set of data is the probability of obtaining that particular set of data given the chosen probability model. This expression contains the unknown parameters. Those values of the parameter that maximize the sample likelihood are known as the maximum likelihood estimates.

The page also discusses the advantages and disadvantages, as well as software.

3. about smoothed maximum likelihood estimates.
One purpose of the smoothed estimates is to account for sparseness in counts for distributions with a lot of history by backing off to less sparse estimates.
(McDonald, R. (2005). Extracting Relations from Unstructured Text, Department of Computer and Information Science, University of Pennsylvania.)
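To make the definition in point 2 concrete, here is a minimal sketch for the simplest case, a Bernoulli (coin-flip) model. The data and names are made up for illustration. The closed-form estimate (the fraction of ones) is compared with a brute-force search over a grid of parameter values for the one that maximizes the log-likelihood.

```python
import math


def bernoulli_log_likelihood(p, data):
    """Log-likelihood of 0/1 observations under a Bernoulli model with parameter p."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in data)


def bernoulli_mle(data):
    """Closed-form maximum likelihood estimate: the fraction of ones."""
    return sum(data) / len(data)


# Hypothetical sample: 6 ones out of 10 observations.
data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]

p_hat = bernoulli_mle(data)

# Brute-force check: the grid value with the highest log-likelihood.
grid = [i / 100 for i in range(1, 100)]
best_on_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, data))

print(p_hat, best_on_grid)
```

With sparse counts this raw estimate becomes unreliable (an unseen event gets probability zero), which is the motivation for the smoothed, backed-off estimates mentioned in point 3.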

Sunday, July 22, 2007

Conditional Random Fields (CRFs)

from
http://www.inference.phy.cam.ac.uk/hmw26/crf/

Conditional random fields (CRFs) are a probabilistic framework for labeling and segmenting structured data, such as sequences, trees and lattices. The underlying idea is that of defining a conditional probability distribution over label sequences given a particular observation sequence, rather than a joint distribution over both label and observation sequences. The primary advantage of CRFs over hidden Markov models is their conditional nature, resulting in the relaxation of the independence assumptions required by HMMs in order to ensure tractable inference. Additionally, CRFs avoid the label bias problem, a weakness exhibited by maximum entropy Markov models (MEMMs) and other conditional Markov models based on directed graphical models. CRFs outperform both MEMMs and HMMs on a number of real-world tasks in many fields, including bioinformatics, computational linguistics and speech recognition.

tutorial:
Hanna M. Wallach. Conditional Random Fields: An Introduction. Technical Report MS-CIS-04-21. Department of Computer and Information Science, University of Pennsylvania, 2004.

Tuesday, July 10, 2007

Semantics, Syntax and pragmatics

from: Semantic Tagging - Susanne Ekeklint
--------------
The difference between syntactic tagging and semantic tagging is that the categories that are used to mark the entities in the latter case are of a semantic kind. Semantics has to do with intentions and meanings. POS tagging may sometimes be considered semantic but is usually seen as syntactic tagging. By tradition there is a separation between the form side (syntax) and the content side (semantics) of a phrase, and the intention of this is to distinguish what is being said from how it is being said. Levinson (1983) describes the historical background of the terms syntax, semantics and pragmatics by referring to Charles Morris's (1971) distinctions within the study of "the relations of signs", or semiotics.

Syntactics (or syntax) being the study of "the formal relation of signs to one another", semantics the study of "the relations of signs to the objects to which the signs are applicable" (their designata), and pragmatics, the study of "the relation of signs to interpreters". (Morris 1938:6, quoted in Levinson 1983:1)

Levinson says that there is a "pure study" in each one of the three areas; it is however a known fact that in practice the areas often overlap. Semantics includes the study of syntax and pragmatics includes the study of semantics (Allwood, 1993)

--------------
Levinson Stephen C.(1983) Pragmatics. Cambridge University Press
Morris, Charles W. (1971) Writings on the General Theory of Signs. The Hague: Mouton.
Morris 1938:6

Sunday, July 1, 2007

The slide about text categorization

Mainly about the content of chapter 16 of the book Foundations of Statistical Natural Language Processing

Here is the slide.

Sunday, June 17, 2007

Generative Model and Discriminative Model

==
Ref: MLWiki
A generative model is one which explicitly states how the observations are assumed to have been generated. Hence, it defines the joint probability of the data and latent variables of interest.

==
See also: Generative Model from wikipedia.

==
ref: A simple comparison
of
generative model (model likelihood and prior) <-- NB..
and discriminative model (model posterior) <-- SVM

==
ref: Classify Semantic Relations in Bioscience Texts.

Generative models learn the prior probability of the class and the probability of the features given the class; they are the natural choice in cases with hidden variables (partially observed or missing data). Since labeled data is expensive to collect, these models may be useful when no labels are available. However, in this paper we test the generative models on fully observed data and show that, although not as accurate as the discriminative model, their performance is promising enough to encourage their use for the case of partially observed data.

Discriminative models learn the probability of the class given the features. When we have fully observed data and we just need to learn the mapping from features to classes (classification), a discriminative approach may be more appropriate.

It must be pointed out that the neural network (discriminative model) is much slower than the graphical models (HMM-like generative models), and requires a great deal of memory.

Wednesday, June 6, 2007

Collocations

Chap 5 of foundations of statistical natural language processing

- a collocation is an expression consisting of two or more words that correspond to some conventional way of saying things.

- Collocations are characterized by limited compositionality.
(we call a natural language expression compositional if the meaning of the expression can be predicted from the meaning of the parts.)
Collocations are not fully compositional in that there is usually an element of meaning added to the combination.
--> non-compositionality
--> non-substitutability
--> non-modifiability

-term: the word term has a different meaning in information retrieval. There it refers to both words and phrases.

- a number of approaches to finding collocations:
a)selections by frequency,
raw frequency doesn't work.
With part-of-speech tag patterns, one gets surprisingly good results. <-- Justeson and Katz's method. Hint: a simple quantitative technique combined with a small amount of linguistic knowledge goes a long way.
works well for fixed phrases.

b)selection based on mean and variance of the distance between the focal word and the collocating word
scenario: the distance between two words is not constant, so a fixed-phrase approach would not work.

collocational window (usually a window of 3 to 4 words on each side of a word)

Mean and variance of the offsets between two words in a corpus.

c)hypothesis testing (********)

In b) we cannot be sure that the high frequency and low variance of two words are not accidental. So we also take into account how much data we have seen. Even if there is a remarkable pattern, we will discount it if we haven't seen enough data to be certain that it couldn't be due to chance.

-->null hypothesis.
-->t test: assume that the probabilities are approximately normally distributed.
The t test looks at the mean and variance of a sample of measurements, where the null hypothesis is that the sample is drawn from a distribution with mean $\mu$. The test looks at the difference between the observed and expected means, scaled by the variance of the data, and tells us how likely one is to get a sample of that mean and variance (or a more extreme mean and variance) assuming that the sample is drawn from a normal distribution with mean $\mu$.
(todo)

--> Chi-square test
The essence of the test is to compare the observed frequencies in the table with the frequencies expected for independence. If the difference between observed and expected frequencies is large, then we can reject the null hypothesis of independence.
d)mutual information

--> likelihood ratios:
more appropriate for sparse data than the chi-square test

Controlled Language

What are Controlled Natural Languages?
"Controlled Natural Languages are subsets of natural languages whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity. Traditionally, controlled languages fall into two major categories: those that improve readability for human readers, particularly non-native speakers, and those that improve computational processing of the text."

[link]

Friday, May 18, 2007

Defining your own verbatim format with the fancyvrb package

Example:

Sunday, May 6, 2007

LaTeX formulas

http://en.wikipedia.org/wiki/Help:Formula

Changing the appearance of item labels in LaTeX

\renewcommand{\labelitemi}{$\star$}

e.g. change the "dot" to "-"

\renewcommand{\labelitemi}{-}
\begin{itemize}
\item aaaaaa
\item bbbbbbb
\end{itemize}
\end{definition}

Examples of Math

from Information Retrieval in Folksonomies: Search and Ranking
Andreas Hotho, Robert Jaschke, Christoph Schmitz, Gerd Stumme

Saturday, May 5, 2007

reading

recent publications about tagging and the SW

Thursday, May 3, 2007

Using minipage inside side-by-side subfigures in LaTeX

\begin{figure}[htb]
\centering \mbox{
\subfigure[title for sub figure 1]{
\begin{minipage}[c]{3.7cm}
\small
a sentence\\
a sentence\\
a sentence\\
a sentence\\
\end{minipage}
\label{fig:labelOfSubFigureA}
}\quad

\subfigure[title for sub figure b]{
\begin{minipage}[c]{3.7cm}
\small
a sentence\\
a sentence\\
a sentence\\
a sentence\\
a sentence\\
\end{minipage}
\label{fig:tripleTagsB} }
} \caption{Triple tags}
\label{fig:label for sub figure B}
\end{figure}

--------------------------------------
e.g.

Wednesday, May 2, 2007

Writing definitions in LaTeX

\documentclass[times, 10pt,twocolumn]{article}
....
\usepackage{amsmath,amsthm,amssymb} % load the AMS math environments
....
\newtheorem{definition}{Definition}
\newtheorem{theorem}{Theorem} % (for axioms)

\begin{document}
...
% (If you only need definition, loading the amssymb package is enough)
\begin{definition}[Name of the Definition]
content of the definition
\end{definition}

....

Monday, April 30, 2007

About RDF query

1. http://www.w3.org/2001/11/13-RDF-Query-Rules/
This document provides a survey of RDF query language and implementations and describes their capabilities in terms described in RDF Query and Rules Framework.

2.SPARQL: A query platform for Web 2.0 and the Semantic Web

A full list of different query language implementations can be seen at http://www.w3.org/2001/11/13-RDF-Query-Rules/. But these languages lack both a common syntax and a common semantics. In fact, the existing query languages cover a significant semantic range: from declarative, SQL-like languages, to path languages, to rule or production-like systems. SPARQL had to fill this gap.

Wednesday, April 25, 2007

vocabulary

1. generative v.s. discriminative
2. inductive v.s. recursive

Monday, April 9, 2007

about Kernel method

tutorial on kernel method

Bernhard Schölkopf. Statistical learning and kernel methods. MSR-TR 2000-23, Microsoft Research, 2000.

Kernel methods retain the original representation of objects and use the objects in algorithms only via computing a kernel function between a pair of objects. A kernel function is a similarity function satisfying certain properties. More precisely, a kernel function K over the object space X is a binary function $K: X \times X \to [0, \infty)$ mapping a pair of objects $x, y \in X$ to their similarity score K(x,y). A kernel function is required to be symmetric and positive-semidefinite.
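The two requirements at the end (symmetry and positive-semidefiniteness) can be spot-checked on a Gram matrix. The sketch below is an illustration only: it uses an RBF kernel on random 2-D points (my choice of kernel and data), and it tests the quadratic form z^T K z on random vectors rather than computing eigenvalues, so it can reveal a violation but is not a proof.

```python
import math
import random


def rbf(x, y, gamma=5.0):
    """RBF kernel used as the example similarity function."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def gram_matrix(points, kernel):
    """Matrix of all pairwise kernel values K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def quadratic_form(matrix, z):
    """Compute z^T K z; for a positive-semidefinite K this is never negative."""
    n = len(z)
    return sum(z[i] * matrix[i][j] * z[j] for i in range(n) for j in range(n))


random.seed(0)
points = [(random.random(), random.random()) for _ in range(6)]
K = gram_matrix(points, rbf)

is_symmetric = all(K[i][j] == K[j][i] for i in range(6) for j in range(6))

# Smallest value of z^T K z over many random vectors z.
min_form = min(
    quadratic_form(K, [random.uniform(-1, 1) for _ in range(6)])
    for _ in range(1000)
)

print(is_symmetric, min_form >= 0)
```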

Monday, April 2, 2007

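NOTE: enable the curl extension in PHP

Source: here

Version: php5.05
@php_curl.dll is already bundled, in the ext directory; this DLL is used to support SSL and zlib.
@In php.ini, find the line extension=php_curl.dll and remove the comment marker in front of it.
@Set extension_dir=c:\php\ext
@Copy libeay32.dll, ssleay32.dll, php5ts.dll and php_curl.dll into the system32 directory (or add the directory of these files to the system path variable)
@Restart apache/IIS and you are done.

----------------------------------
Check the curl version:

<?php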
print_r(curl_version());
?>

Thursday, March 29, 2007

Conference Information

NLP,
conference from the linguist list

Saturday, March 24, 2007

Thursday, March 22, 2007

T-FaNT

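Noting down the URL; it is a hassle to look for it every time.

Last week I attended a forum at the University of Tokyo:
T-FaNT 07 (Tokyo Forum on Advanced NLP and TM)

Many big names attended...
In terms of content, compared with an ordinary conference it focused more on discussing and understanding the problems themselves. I learned a lot.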

Saturday, March 17, 2007

Why link2005 occurs (repost)

The original article is here.

MSDN is also very useful.

Thursday, March 8, 2007

Dependency Parsing

Ref:
A Fundamental Algorithm for Dependency Parsing

Notes of Content:

1. phrase-structure (constituency) parser v.s. dependency parsing.
2. constituency grammar v.s. dependency grammar.
3. Dependency tree:

If two words are connected by a dependency relation, one is the head and the other is the dependent, connected by the link.

In the dependency tree, constituents (phrases) still exist.

4. In 1965 it was proved that dependency grammar and constituency grammar are strongly equivalent - that they can generate the same sentences and make the same structural claims about them - provided the constituency grammar is restricted in a particular way.

Wednesday, March 7, 2007

Introduction to Syntactic Parsing

Introduction to Syntactic Parsing
A good introduction for the beginners.

by Roxana Girju
November 18, 2004

Tuesday, March 6, 2007

Five basic sentence patterns in English

First of all, thanks to Jin for his help
Reference 1
Reference 2
Reference 3

-----------------------
From reference 2.
"
Now, I'm sure you know that there are five basic sentence patterns, consisting of necessary elements which are S(subject), V(verb), O(object), and C(complement).

I list out the 5 patterns here:

S+V
S+V+C
S+V+O
S+V+O+C
S+V+O+O
"

Wednesday, February 28, 2007

Good starting point

A new start, although it comes a little late. But so much better than never.

Today I talked with my professor and was told I *can* decide whether to work on the topic he assigned or the one I have been wanting to do for so long.

But the bad news is that it seems difficult for a foreigner with little knowledge of Japanese to find a research position in Japan...wow....

Saturday, February 17, 2007

DONE:-)

http://midi.jie.yang.googlepages.com/courses

Friday, February 16, 2007

Inserting side-by-side subfigures in LaTeX

Use the graphicx and subfigure packages. Here is an example.

\begin{figure}[htb]
\centering
\mbox{
\subfigure[Cell loss rate versus speedup factor]{\includegraphics[scale=1.0]{temp1.eps}}\quad
\subfigure[Cell loss rate versus buffer size]{\includegraphics[scale=1.0]{temp2.eps}}
}
\caption{Factors affecting the cell loss rate}
\label{Fig:CellDropRates}
\end{figure}

Thursday, February 15, 2007

harmonic series and Riemann Zeta Function

>>Harmonic series: $\sum_{n=1}^{\infty} 1/n$

>>Riemann Zeta Function

the most common form of Riemann Zeta Function:
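For reference, the harmonic series and the usual series definition of the zeta function can be written side by side. This is the standard textbook form, stated here from memory rather than taken from the original post; the partial-sum symbol $H_N$ is my own notation.

```latex
\[
  H_N = \sum_{n=1}^{N} \frac{1}{n} \longrightarrow \infty \quad (N \to \infty),
  \qquad
  \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}}, \quad \mathrm{Re}(s) > 1 .
\]
```

The harmonic series is the borderline case s = 1, where the sum diverges; the series for $\zeta(s)$ converges only for $\mathrm{Re}(s) > 1$.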

Monday, January 29, 2007

inf (glb) and sup

from mathforum

inf means "infimum," or "greatest lower bound." This is
slightly different from minimum in that the greatest lower bound is
defined as:

x is the infimum of the set S [in symbols, x = inf (S)] iff:

a) x is less than or equal to all elements of S
b) there is no other number larger than x which is less than or equal
to all elements of S.

Basically, (a) means that x is a lower bound of S, and (b) means that
x is greater than all other lower bounds of S.

This differs from min (S) in that min (S) has to be a member of S.
Suppose that S = {2, 1.1, 1.01, 1.001, 1.0001, 1.00001, ...}. This
set has no smallest member, no minimum. However, it's trivial to show
that 1 is its infimum; clearly all elements are greater than or equal
to 1, and if we thought that something greater than 1 was a lower
bound, it'd be easy to show some member of S which is less than it.

So that's the difference between inf and min. It's worth noting that
every set has an inf (assuming minus infinity is okay), and that the
two concepts are the same for finite sets.

glb is another way of writing inf (short for "greatest lower bound")

------------------------
sup and lub are short for "supremum" and "least upper bound," the mirror-image concept for upper bounds.
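
The inf-versus-min example above can be checked with a small sketch. It assumes the set S is generated by the rule 1 + 10^(-k) for k = 0, 1, 2, ... (my reading of the listed elements), and the helper names `s_element` and `beats_lower_bound` are made up for illustration. Exact fractions are used so that no floating-point rounding hides the gap above 1.

```python
from fractions import Fraction


def s_element(k: int) -> Fraction:
    """k-th element of S = {2, 1.1, 1.01, ...}, assumed to be 1 + 10**(-k)."""
    return 1 + Fraction(1, 10 ** k)


def beats_lower_bound(candidate: Fraction) -> Fraction:
    """For a candidate lower bound greater than 1, return an element of S below it."""
    if candidate <= 1:
        raise ValueError("candidate must be greater than 1")
    k = 0
    while s_element(k) >= candidate:
        k += 1
    return s_element(k)


# Every finite truncation of S has a minimum, and it is strictly above 1.
prefix = [s_element(k) for k in range(6)]
print(min(prefix))                 # 100001/100000
print(all(x > 1 for x in prefix))  # True

# Any candidate lower bound greater than 1 is beaten by some element of S,
# so 1 is the greatest lower bound even though 1 is not a member of S.
print(beats_lower_bound(Fraction(1001, 1000)))  # 10001/10000
```

The loop in `beats_lower_bound` always terminates because 10^(-k) eventually drops below any positive gap, which is exactly part (b) of the definition.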

Saturday, January 20, 2007

Latent Semantic Indexing (LSI)

1) Here is a great explanation for the layman.

or latent semantic analysis

"The term ’semantics’ is applied to the science and study of meaning in language, and the meaning of characters, character strings and words. Not just the language and words themselves, but the true meaning being conveyed in the context in which they are being used.

In 2002 a company called Applied Semantics, an innovator in the use of semantics in text processing, launched a program known as AdSense, which was a form of contextual advertising whereby adverts were placed on website pages which contained text that was relevant to the subject of the adverts.

The matching up of text and adverts was carried out by software in the form of mathematical formulae known as algorithms. It was claimed that these formulae used semantics to analyze the meaning of the text within the web page. In fact, what it initially seemed to do was to match keywords within the page with keywords used in the adverts, though some further interpretation of meaning was evident in the way that some relevant adverts were correctly placed without containing the same keyword character string as used on the web page.

"

2) From "Patterns in Unstructured Data: Discovery, Aggregation, and Visualization"

Wednesday, December 27, 2006

model-theoretic semantics

>>"The basic of model-theoretic semantics can be roughly described as the following.
For a formal language L, a model M consists of descriptions about
objects and their factual relations in a domain. The descriptions are written in
another language Lm, which is a meta-language, and can either be a natural
language, like English, or another formal language. An interpretation I maps
the words in L onto the objects and relations in M. According to this theory,
the meaning of a word in L is defined as its image in M under I, and whether
a statement in L is true is determined by whether it is mapped by I onto a
fact in M."

from "Experience-Grounded Semantics: A theory for intelligent systems", p. 4
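
The quoted definition can be made concrete with a toy sketch. Everything here is a made-up illustration, not something from the paper: the model holds one fact, the language L has three words, and statements are assumed to be simple (subject, predicate, object) triples.

```python
from typing import Tuple

# Model M: objects in a domain and the factual relations among them.
# All identifiers are hypothetical, chosen only for illustration.
model = {
    "objects": {"socrates_obj", "plato_obj"},
    "facts": {("teacher_of_rel", "socrates_obj", "plato_obj")},
}

# Interpretation I: maps words of the language L onto objects and relations in M.
interpretation = {
    "Socrates": "socrates_obj",
    "Plato": "plato_obj",
    "teaches": "teacher_of_rel",
}


def meaning(word: str) -> str:
    """The meaning of a word in L is its image in M under I."""
    return interpretation[word]


def is_true(statement: Tuple[str, str, str]) -> bool:
    """A (subject, predicate, object) statement in L is true iff I maps it onto a fact in M."""
    subject, predicate, obj = statement
    return (meaning(predicate), meaning(subject), meaning(obj)) in model["facts"]


print(is_true(("Socrates", "teaches", "Plato")))  # True
print(is_true(("Plato", "teaches", "Socrates")))  # False
```

Swapping in a different `model` or `interpretation` changes what the same words mean and which statements are true, without touching the sentences of L themselves.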

>>According to model-theoretic semantics, for any formal language L, the necessary
and sufficient condition for its terms to have meaning and for its statements to have truth value is the existence of a model. In different models, the meaning of terms and truth value of statements may change; however, these changes are not caused by using the language. A reasoning system R that processes sentences in L does not depend on the semantics of L when the system runs. That means, on the one hand, that R needs no access to the meanings of terms and truth values of statements — it can distinguish terms only by their forms, and derive statements from other statements only according to its (syntactically defined) inference rules, but it puts little constraint on how the language can be interpreted. On the other hand, what knowledge R has and what operations R performs have no influence on the meaning and truth
value of the terms and sentences involved. Such a treatment is desired in pure mathematics.

Tuesday, December 26, 2006

syntax && semantic relation.

The work of NLP vs. semantic theory.

from
http://www.cogsci.uni-osnabrueck.de/~gkatz/Semantics/SlidesPDF/Lecture%20Two.pdf
-------------------------
Syntax is an independent field of study concerned with determining what the grammatical structures are

semantic theory interprets those structures.

-------------------------

What is semantic? - beginning

"semantic web, semantic web, semantic web..."
"Stop, may I ask a question? what do you mean by 'semantic'?"

I hope one day I can answer this question clearly.
So from now on I will write down the related pieces I come across and the points I have understood. Discussion is warmly welcome, if you are interested.

I will create a new label named "WhatIsSemantic", so that everyone (mainly me, I guess) can find all the posts easily. The titles of future posts will therefore not necessarily be "What is semantic(#)".

Sunday, December 24, 2006

NOTE: Semantic Theory

lecture
1.introduction to semantic theory

2.semantic theory