2008-12-02

Teal, Truss, and NCsim: Resolved all the Random Crash Issues

Finally, I've resolved yet another random crash issue for teal, truss, and ncsim. The setup has now passed more than 100,000 verification loops without crashing. Thanks to Mike and Robert, who chose to open-source teal and truss. As a result, I got a great chance to dig into the source code and learn the internal mechanisms of a verification framework.

Today, I felt lucky to find another person, in another corner of the world, who also chose teal and customized it. Here's his post at Verification Guild:

"I am using teal right now. We picked it up about a year ago, and have done fairly extensive customization to it. I cannot, however, recommend the version we originally downloaded, it had some bugs, and I have not downloaded a newer version since then, so I can't say anything about the bug fixes. I have to say, that as far as the concepts and functionality of teal go, it is exactly what it should be. I couldn't imagine an easier and cleaner way to connect verilog to C++. I would suggest starting with it, and seeing how far it takes you. It is also free. -- Richard"

2008-11-26

Resolved a subtle teal::vout bug

I've confirmed a subtle teal::vout bug in a multi-threaded truss verification environment. On my Linux boxes (both i686 and x86_64), ncsim crashed randomly. Although the crash rate is below 1%, it keeps me from running regression tests with teal/truss.

Here's the symptom:
./teal_vout.cpp: (set_file_and_line:631) test_component_0 thread(0x44606960) vout(0x2109718) [FILE: truss/trunk/cpp/inc/truss_test_component.h][line: 70] begin
... ... ...
./teal_vout.cpp: (set_file_and_line:631) test_component_0 thread(0x40A00960) vout(0x2109718) [FILE: ./test_component.cpp][line: 92] begin
... ... ...
=> crashed

Here's the reason:
  • Thread test_component_0 is with ID(0x44606960)
  • Thread verification_top is with ID(0x40A00960)
  • vout(0x2109718) is constructed by thread(0x40A00960) with test_component_0 as its function_area_ name
truss::verification_top calls "test_0->start()", which calls "lls::test_component_[0]->start()", which calls "truss::thread.start()" to create a new thread(0x44606960), which calls "truss::test_component.start_()" to start a new teal::vout message to vout(0x2109718).

However, before it was finished with teal::vout::end_message_() at thread(0x44606960), a new message to the same vout(0x2109718) was issued. It was called from "truss::verification_top::test_0->wait_for_completion ()", which calls "lls::test_component_[0]->wait_for_completion ()", which calls "truss::test_component.wait_for_completion()", which calls "lls::test_component::wait_for_completion_()" at thread (0x40A00960).

As a result, the message_list_ and the other maps of teal::vout got polluted. Sometimes it ends up with a "glibc detected double free or corruption" error, which crashes the simulation.

And, here's my solution:
  1. remove internal_mutex_sentry() from teal::vout::put_message()
  2. put a pthread_mutex_lock() at teal::vout::set_file_and_line()
  3. put a pthread_mutex_unlock() at teal::vout::clear_message_()
This decreased the simulation crash rate dramatically (from 1% to below 0.1%), but there's still a simulation error to be resolved: "Cannot place a value with no or zero delay at read only sync".
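
To make the idea concrete, here is a minimal sketch of holding one lock for the whole lifetime of a message, from the first call to the last. It is not the real teal::vout code: message_log, begin_message(), put(), end_message(), and run_two_writers() are hypothetical names that only mirror the roles of set_file_and_line(), put_message(), and clear_message_() described above.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a shared logger; NOT the real teal::vout.
class message_log {
public:
    message_log() { pthread_mutex_init(&mutex_, nullptr); }
    ~message_log() { pthread_mutex_destroy(&mutex_); }

    // Plays the role of set_file_and_line(): lock when a message starts.
    void begin_message(const std::string& file, int line) {
        pthread_mutex_lock(&mutex_);
        pending_ = file + ":" + std::to_string(line) + " ";
    }

    // Plays the role of put_message(): no locking here, the caller
    // already owns the mutex from begin_message().
    void put(const std::string& text) { pending_ += text; }

    // Plays the role of clear_message_(): publish, reset, then unlock.
    void end_message() {
        lines_.push_back(pending_);
        pending_.clear();
        pthread_mutex_unlock(&mutex_);
    }

    const std::vector<std::string>& lines() const { return lines_; }

private:
    pthread_mutex_t mutex_;
    std::string pending_;             // message under construction
    std::vector<std::string> lines_;  // finished messages
};

struct writer_args { message_log* log; const char* name; int count; };

// Thread body: emits `count` messages, each built from several calls.
void* writer(void* p) {
    writer_args* a = static_cast<writer_args*>(p);
    for (int i = 0; i < a->count; ++i) {
        a->log->begin_message(a->name, i);
        a->log->put("begin ");
        a->log->put("end");
        a->log->end_message();
    }
    return nullptr;
}

// Runs two writers against one log and returns how many finished
// messages end with the complete, uninterleaved text "begin end".
std::size_t run_two_writers(int count) {
    message_log log;
    writer_args a = {&log, "test_component_0", count};
    writer_args b = {&log, "verification_top", count};
    pthread_t ta, tb;
    int rc = pthread_create(&ta, nullptr, writer, &a);
    assert(rc == 0);
    rc = pthread_create(&tb, nullptr, writer, &b);
    assert(rc == 0);
    pthread_join(ta, nullptr);
    pthread_join(tb, nullptr);

    const std::string tail = "begin end";
    std::size_t intact = 0;
    for (const std::string& s : log.lines()) {
        if (s.size() >= tail.size() &&
            s.compare(s.size() - tail.size(), tail.size(), tail) == 0) {
            ++intact;
        }
    }
    return intact;
}
```

Calling run_two_writers(1000) from a test program should report every message as intact. One trade-off of this design: with a default (non-recursive) mutex, a thread that starts a second message before ending the first would block on itself, so begin and end calls have to be strictly paired.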

2008-08-31

Building a Verification Platform for Logic Designs with C++

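Most tutorial examples for logic design implement the testbench in a hardware description language (HDL), as shown in the figure below:

Figure 1: Testbench with Stimuli and DUT

This approach has the following drawbacks:
  1. The DUT's outputs must be checked by hand to confirm that the design is correct.
  2. HDLs are not languages optimized for writing testbenches.
  3. Large, complex testbenches are hard to implement.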

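For this reason, the IC design industry has come up with many different solutions for automated verification of logic designs. Basically, such a flow must provide the following functions (as shown in Figure 2):
  1. Randomly generate stimuli for the Functional Model (FM) and the DUT (Design Under Test).
  2. Automatically compare the outputs of the FM and the DUT (Results Checker).
  3. When the comparison finds a mismatch, record the configuration and error condition at that moment, so the error can be reproduced and debugged.
  4. Allow complex verification procedures to be designed and handled efficiently.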

Figure 2: Basic Verification Framework
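
As a rough illustration of this flow, the sketch below drives the same seeded random stimuli into a functional model and a DUT stand-in, compares the results, and records enough context to replay any mismatch. Everything in it is hypothetical and written in plain C++ with no simulator attached: the 8-bit adder, functional_model(), good_dut(), buggy_dut(), and run_checks() are illustrative names, not part of Teal/Truss.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <random>
#include <vector>

// What gets recorded on a mismatch: enough to replay the failing case.
struct mismatch {
    unsigned seed;         // seed of the random stimulus stream
    int index;             // position of the stimulus in that stream
    std::uint8_t a, b;     // the stimulus itself
    std::uint8_t expected; // functional model output
    std::uint8_t actual;   // DUT output
};

// Functional Model (FM): hypothetical 8-bit wrapping adder.
std::uint8_t functional_model(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// Stand-ins for the DUT. In a real flow this would be the HDL design
// running in a simulator; here they are ordinary functions.
std::uint8_t good_dut(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}
std::uint8_t buggy_dut(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b + 1);  // deliberate off-by-one
}

// Generates `count` random stimuli from `seed`, feeds them to both the
// FM and the DUT, and returns a record for every differing result.
std::vector<mismatch> run_checks(
        unsigned seed, int count,
        const std::function<std::uint8_t(std::uint8_t, std::uint8_t)>& dut) {
    assert(count >= 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    std::vector<mismatch> failures;
    for (int i = 0; i < count; ++i) {
        const std::uint8_t a = static_cast<std::uint8_t>(dist(rng));
        const std::uint8_t b = static_cast<std::uint8_t>(dist(rng));
        const std::uint8_t expected = functional_model(a, b);
        const std::uint8_t actual = dut(a, b);
        if (expected != actual) {  // Results Checker
            failures.push_back(mismatch{seed, i, a, b, expected, actual});
        }
    }
    return failures;
}
```

Running run_checks(42, 100, good_dut) should return an empty list, while passing buggy_dut flags every stimulus. Because each record keeps the seed and the index, re-running with the same seed on the same build regenerates the same stimulus stream for debugging.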

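Now I plan to try out Teal/Truss. It defines a complete logic verification framework; through PLI/VPI, it lets you build verification programs in C++ to verify designs written in Verilog. It is also open-source free software, so we can obtain and use it at no cost.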