Understanding how a 2022 CPU works, as a noob

CPU design is pretty complex in the modern world (M1, Intel big-little, AMD chiplets) and looks nothing like what I learned in computer architecture back when I was young. Over the past few months I read some good articles and wanted to write down some of my understanding.

Read More

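Reflections on one year of working in software after switching from electrical engineering

Before I knew it, I had spent a full year at a small company. Compared with classmates from my year who joined publicly listed or foreign companies, I am an oddity among oddities. Work is just work and there is not much worth bringing up, but while everyone keeps saying how great it is to jump to a foreign company or a "豬屎屋" (IC design house), I wanted to come out and stir things up a bit.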

Read More

SPP in GANs

Recently a paper titled "Characterizing signal propagation to close the performance gap in unnormalized ResNets" caught my eye, because Brock et al. suggest that by plotting signal propagation plots (SPPs) you can debug possible bugs in your network. Since GAN research has largely focused on stabilizing the training process (although in recent years there has been a shift to more stable methods such as denoising diffusion and VAE methods), the possibility of using SPPs as a probing tool to debug GANs is still quite attractive.

Read More

Debugging the DenyHosts whitelist

For several months I had issues connecting from my lab IP to a remote server. All that time we thought the issue was caused by the crappy network backbone where we host the server.

Read More

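A glossary of natural language processing terminology

Honestly, because I did not learn English well enough as a kid, I keep running into technical terms that I have to look up every time I read NLP papers (sometimes I get the urge to go take linguistics courses in the foreign languages department...). Hence this glossary of terms.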

Read More

ICLR 2020 papers in two sentences

This is a list of papers I went through during the International Conference on Learning Representations (ICLR) 2020. I want to thank the ICLR organizers for selecting me as a volunteer.

Read More

How not to do Google Stadia

Google's Stadia, a cloud gaming service, looked destined to fail quickly when buyers of the premium launch package received their package but could not use the service because they had not received the activation code. This comes on top of Google's already poor reputation for refunds [1, 2, 3]. Aside from the activation code issue (likely because Google wants to control compute resource usage), there are still some lessons we should learn from their mistakes.

Read More

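Combining PostgreSQL database backups

Recently my database filled up my cloud storage, which started to make me anxious. Some of the data is basically never read by users, yet it cannot simply be deleted, because later analysis needs it to be complete. So I bought hard drives and plan to move all the data that is too old from the remote servers to my own local machine.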

Read More

(Note) Simple Evolutionary Optimization Can Rival Stochastic Gradient Descent in Neural Networks (LEEA)

This paper offers some insight into why a gradient-based optimization algorithm such as SGD works so well, and how an evolutionary algorithm (EA) can perform as well as gradient-based algorithms.

The EA learning method is as follows:

  1. A population of individual models is generated, and a fitness score is evaluated for every individual.

  2. Only the top N individuals are selected, and sexual and asexual reproduction are used to generate a new generation.

  3. Repeat steps 1 and 2 until convergence is reached.
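
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the paper's LEEA implementation: the function name `evolve`, its parameters (`top_n`, `sigma`, `generations`), uniform crossover, Gaussian mutation, keeping the parents in the next generation, and stopping after a fixed number of generations instead of testing for convergence are all assumptions made for the example.

```python
import random


def evolve(fitness, dim, pop_size=30, top_n=10, sigma=0.1, generations=200, seed=0):
    """Toy evolutionary loop over weight vectors (hypothetical helper)."""
    rng = random.Random(seed)
    # Generate the initial population of individuals (plain weight lists).
    population = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    # Step 3: repeat; a fixed generation budget stands in for a convergence test.
    for _ in range(generations):
        # Step 1: score every individual, best first.
        ranked = sorted(population, key=fitness, reverse=True)
        # Step 2: keep only the top N individuals as parents.
        parents = ranked[:top_n]
        children = list(parents)  # assumption: parents survive unchanged
        while len(children) < pop_size:
            if rng.random() < 0.5:
                # Sexual reproduction: uniform crossover of two distinct parents.
                a, b = rng.sample(parents, 2)
                child = [rng.choice(pair) for pair in zip(a, b)]
            else:
                # Asexual reproduction: copy a single parent.
                child = list(rng.choice(parents))
            # Mutate the child with small Gaussian noise.
            children.append([w + rng.gauss(0, sigma) for w in child])
        population = children
    return max(population, key=fitness)
```

A fitness function here is any callable that maps a weight list to a score where higher is better, for example the negative squared error against a target vector. For a neural network it would typically be the negative loss on a batch of data.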

SGD is usually described as finding a locally optimal solution. However, an ANN model has so many possible weight configurations that settling in a local optimum should not produce the state-of-the-art results seen today. The argument given is that there are many paths for escaping a local optimum, so the final result is not a local optimum. The authors suggest instead that the real culprit for SGD is the saddle point problem, which other gradient-based algorithms such as RMSProp and Adagrad aim to solve.

Read More

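Beware of the David Publishing academic scam

While checking my messages today, I received an email from a company calling itself David Publishing, which wanted to publish one of my previously published papers in their journal. It looked interesting at first glance, but on closer reading the wording of the email was very odd. For example, the sentence “we wish to become your friends if we may.” is something probably only said in Chinese-speaking culture (formal correspondence generally does not use "friends").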

Read More

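My 3 recommended English podcast channels

A podcast (called "Podcasting" directly in Taiwan and Hong Kong[1]) is a form of digital media: a series of audio, video, digital radio, or text files published as a list over the internet, which listeners subscribe to on their electronic devices in order to download or stream the files.

  • Wikipedia, "播客" (Podcast)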
Read More

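Byte Pair Encoding: balancing corpus vocabulary size and encoded information

In natural language text processing, the first step is to convert text into some numeric representation. Take word embeddings as an example: we use a dictionary to map each word to a word vector. In the real world, however, new words that are not in your dictionary keep showing up, and all you can do then is substitute a symbol meaning "unknown" for that word. In addition, a language has too many words; if every word is mapped to a word vector, the model becomes too large to run on an ordinary computer (FastText with 3 million words needs 6 GB of memory).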

Read More

Summarizing MT-DNN, the state-of-the-art model in language tasks before XLNet

The Multi-Task Deep Neural Network (MT-DNN) for language understanding, proposed by Microsoft, achieved SOTA results in April on 10 language tasks (GLUE, SNLI, …). The authors propose that multi-task learning, which was previously used in the computer vision domain to achieve state-of-the-art results, can also be used to improve language understanding scores.

Read More

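Natural language from past to present, Part 2

The second talk was more of a sharing session and its content was rather scattered, so I can only do my best to organize it. I think the point is to bring up some newer concepts that can serve as entry points for future research directions, and then let everyone explore on their own. After all, covering every concept would probably take quite a while.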

Read More

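Neural autoregressive density estimator (NADE)

In machine learning, if you want to find the probability density function (PDF) of your data, the usual choices are autoencoders, restricted Boltzmann machines (RBM), and GMMs. A lesser-known alternative is the neural autoregressive density estimator (NADE), which uses autoregression to find the PDF of the data. In experiments NADE has been shown to outperform RBMs and autoencoders, in particular surpassing the RBM, which has long been very effective on Bernoulli distributions.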

Read More

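Conclusions from doing a final project with Alpha Zero

Anyway, the story goes roughly like this: the final for NCTU's Introduction to Artificial Intelligence course is an AI game competition, played on a game nobody has played before. To win, my teammates and I went straight to training an Alpha Zero neural network. Midway through, however, new rules appeared that limited the hardware specs and banned packages (except numpy), so we gave up halfway and submitted a heavily modified Monte Carlo tree search (MCTS) as our final project instead.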

Read More

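Targeted Dropout: a more effective dropout mechanism

Dropout is a widely used mechanism for preventing neural networks from overfitting, and it appears in many large neural models. During training, standard dropout randomly masks part of the network, forcing the model, now with fewer parameters, to learn the target task.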
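
As a rough illustration of the standard (untargeted) dropout described above, here is a minimal sketch. The function name `dropout` and its parameters are hypothetical, and rescaling the surviving activations by `1 / (1 - p_drop)` (so-called inverted dropout) is an assumption taken from common practice rather than from this post.

```python
import random


def dropout(activations, p_drop=0.5, training=True, rng=random):
    """Randomly zero activations during training (hypothetical helper)."""
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    # At inference time nothing is masked.
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    # Keep each unit with probability `keep`; rescale survivors so the
    # expected value of each activation stays the same (assumption).
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Calling `dropout(layer_output, p_drop=0.5)` on a list of activations during training masks roughly half of them on each call, while `training=False` returns the activations unchanged.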

Read More

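FractalNet: ultra-deep neural networks without residual connections

These days a new skip connection (residual connection) shows up every so often, and researchers who felt this trend should not continue came up with FractalNet. Building on the idea that skip connections solve the vanishing gradient problem, FractalNet uses a ladder-like design that adds more pipelines for information to flow through. Because each pipeline passes through fewer weights, gradients can update the lower-layer weights through the pipelines that offer less resistance.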

Read More

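Micro-framework benchmark comparison

Over my university years I have accumulated some web side projects. Considering development, maintenance, and compute costs, though, the do-everything Django may not be the best choice. Django's architecture is too complex and its maintenance cost is correspondingly high, which makes it a poor fit for simple, low-importance side projects (aren't 90% of projects just CRUD anyway?).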

Read More

Ansible Introduction

Recently I got my hands on a brand new Orange Pi. However, I found myself stuck typing the same boring commands across all sessions. Hence, I believe this is a good opportunity to try Ansible for better automation across all my servers.

Read More

Make Django Great Again Part 3 Sass Compiler

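Sass is an advanced way of writing CSS, with syntax for nesting, loops, variables, arithmetic, functions, and inheritable mixins. Browsers cannot interpret it directly, however, so a "compiler" is needed to convert it to CSS before it can be used.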

Read More

Make Django Development Great Again Part 2 Live Reload

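When I developed Django projects in PyCharm, there was a plugin that automatically refreshed the browser after any file was changed. However, this feature only works with Chrome (because it only supports a Chrome extension...). For someone like me who is used to Safari's Responsive Mode (cmd + option + R), that is not very convenient...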

Read More