2008-11-26

Resolved a subtle teal::vout bug

I've confirmed a subtle teal::vout bug in the multi-threaded truss verification environment. On my Linux boxes (both i686 and x86_64), ncsim crashed randomly. Although the crash rate is below 1%, it keeps me from running regression tests with teal/truss.

Here's the symptom:
./teal_vout.cpp: (set_file_and_line:631) test_component_0 thread(0x44606960) vout(0x2109718) [FILE: truss/trunk/cpp/inc/truss_test_component.h][line: 70] begin
... ... ...
./teal_vout.cpp: (set_file_and_line:631) test_component_0 thread(0x40A00960) vout(0x2109718) [FILE: ./test_component.cpp][line: 92] begin
... ... ...
=> crashed

Here's the reason:
  • Thread test_component_0 has ID 0x44606960
  • Thread verification_top has ID 0x40A00960
  • vout(0x2109718) is constructed by thread(0x40A00960) with test_component_0 as its function_area_ name

truss::verification_top calls "test_0->start()", which calls "lls::test_component_[0]->start()", which calls "truss::thread.start()" to create a new thread (0x44606960). That thread calls "truss::test_component.start_()", which starts a new teal::vout message on vout(0x2109718).

However, before thread 0x44606960 finished that message with teal::vout::end_message_(), a new message to the same vout(0x2109718) was issued from thread 0x40A00960. It came from "truss::verification_top::test_0->wait_for_completion()", which calls "lls::test_component_[0]->wait_for_completion()", which calls "truss::test_component.wait_for_completion()", which calls "lls::test_component::wait_for_completion_()".

As a result, the message_list_ and the other maps of teal::vout got corrupted, because two threads were modifying them at the same time. Sometimes this ends in a glibc "double free or corruption" error, which crashes the simulation.

And, here's my solution:
  1. Remove internal_mutex_sentry() from teal::vout::put_message().
  2. Add a pthread_mutex_lock() call in teal::vout::set_file_and_line().
  3. Add a pthread_mutex_unlock() call in teal::vout::clear_message_().
This decreased the simulation crash rate dramatically (from 1% to below 0.1%), but there's still a simulation error to be resolved: "Cannot place a value with no or zero delay at read only sync".
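The idea behind the three steps is that the lock is taken when a message begins and released only when it ends, so a second thread cannot start a message on the same object halfway through. Here is a minimal sketch of that pattern. It is not the real teal::vout code: mini_vout, begin_message(), put(), end_message(), writer() and run_two_writers() are all hypothetical names, and std::mutex stands in for the pthread_mutex_lock()/pthread_mutex_unlock() pair.

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a logger shared by several threads.
// This is NOT the real teal::vout, only the locking pattern.
class mini_vout {
public:
    // Begin a message: take the lock and keep it until end_message().
    void begin_message(const std::string& file, int line) {
        mutex_.lock();
        current_ = "[" + file + ":" + std::to_string(line) + "] ";
    }

    // Append a fragment. Only valid between begin_message() and end_message().
    void put(const std::string& text) { current_ += text; }

    // Finish the message: store it, clear the buffer, release the lock.
    void end_message() {
        finished_.push_back(current_);
        current_.clear();
        mutex_.unlock();
    }

    // Call this only after all writer threads have been joined.
    const std::vector<std::string>& finished() const { return finished_; }

private:
    std::mutex mutex_;
    std::string current_;                // message under construction
    std::vector<std::string> finished_;  // completed messages
};

// Each writer sends `count` messages, each built from several fragments.
void writer(mini_vout& log, const std::string& name, int count) {
    for (int i = 0; i < count; ++i) {
        log.begin_message(name + ".cpp", i);
        log.put(name);
        log.put(" says ");
        log.put(name);
        log.end_message();
    }
}

// Runs two writers against one shared logger and returns what was logged.
std::vector<std::string> run_two_writers(int count) {
    mini_vout log;
    std::thread a(writer, std::ref(log), std::string("top"), count);
    std::thread b(writer, std::ref(log), std::string("component"), count);
    a.join();
    b.join();
    return log.finished();
}
```

Calling run_two_writers(200) should give 400 intact messages such as "[top.cpp:3] top says top", with no fragment from one thread inside a message from the other. The cost of this design is that a thread which begins a message and never ends it blocks every other user of the same object.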

2008-08-31

Building a Verification Platform for Logic Designs with C++

Most tutorial examples for logic design implement the testbench in a hardware description language (HDL), as shown in the figure below:

Figure 1: Testbench with Stimuli and DUT

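This approach has several drawbacks:
  1. The DUT's output has to be checked by hand to confirm that the design is correct.
  2. HDLs are not languages optimized for implementing testbenches.
  3. Large, complex testbenches are hard to implement.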

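For this reason, the IC design industry has come up with many different solutions for automated verification of logic designs. Basically, such a flow must provide the following functions (see Figure 2):
  1. Randomly generate stimuli for the Functional Model (FM) and the DUT (Design Under Test).
  2. Automatically compare the outputs of the FM and the DUT (Results Checker).
  3. When the outputs differ, record the configuration and the error condition at that moment, so that the error can be reproduced and debugged.
  4. Make it possible to design and handle complex verification procedures efficiently.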

Figure 2: Basic Verification Framework
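To make the framework concrete, here is a minimal sketch of the loop in plain C++. Everything in it is an assumption for illustration: the "design" is a hypothetical 8-bit adder, the DUT is an ordinary C++ function standing in for a Verilog module reached through PLI/VPI, and the names (stimulus, mismatch, functional_model, stuck_bit_dut, run_checks) do not come from Teal/Truss.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// One random input vector for the hypothetical 8-bit adder.
struct stimulus {
    std::uint8_t a;
    std::uint8_t b;
};

// What gets recorded when the FM and the DUT disagree. The seed and the
// index are enough to regenerate the failing stimulus later.
struct mismatch {
    unsigned seed;
    int index;
    stimulus input;
    std::uint8_t expected;
    std::uint8_t actual;
};

using dut_fn = std::uint8_t (*)(std::uint8_t, std::uint8_t);

// Functional Model: the golden reference (8-bit add with wrap-around).
std::uint8_t functional_model(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// Stand-in DUT with an injected fault: output bit 0 is stuck at 1.
std::uint8_t stuck_bit_dut(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((a + b) | 0x01);
}

// Drives `count` random stimuli into both the FM and the DUT, compares
// the results, and returns a record for every disagreement.
std::vector<mismatch> run_checks(dut_fn dut, unsigned seed, int count) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> byte(0, 255);
    std::vector<mismatch> failures;
    for (int i = 0; i < count; ++i) {
        stimulus s{static_cast<std::uint8_t>(byte(rng)),
                   static_cast<std::uint8_t>(byte(rng))};
        std::uint8_t expected = functional_model(s.a, s.b);
        std::uint8_t actual = dut(s.a, s.b);
        if (expected != actual) {
            failures.push_back(mismatch{seed, i, s, expected, actual});
        }
    }
    return failures;
}
```

Passing functional_model itself as the DUT should return an empty list, while stuck_bit_dut should fail on every stimulus whose correct sum is even. Rerunning with the same seed regenerates the same stimuli, which is what makes a recorded failure reproducible.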

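Now I plan to try out Teal/Truss, which defines a complete logic verification framework. Through PLI/VPI, it lets you write the verification code in C++ to verify designs written in Verilog. It is also open-source free software, so we can obtain and use it at no cost.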