Chapter 20: Systems Programming in Haskell · Real World Haskell (Chinese edition)

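# Chapter 20: Systems Programming in Haskell

So far, we have mostly discussed high-level concepts. Haskell can also be used for lower-level systems programming: it is quite possible to write Haskell programs that work with the operating system's low-level interfaces.

In this chapter, we are going to attempt something ambitious: a Perl-like "language" that is actually valid Haskell, implemented entirely in Haskell, that makes shell scripting easier. We will implement piping, simple command invocation, and a few simple tools for tasks that would otherwise be handled by grep or sed.

Some modules are operating-system dependent. In this chapter, we use generic, OS-independent modules wherever possible. However, much of the chapter focuses on the POSIX environment. POSIX is a standard for Unix-like operating systems such as Linux, FreeBSD, MacOS X, and Solaris. Windows does not support POSIX by default, but the Cygwin environment provides a POSIX compatibility layer for Windows.

## Invoking external programs

Haskell can invoke external commands. To do so, we suggest using rawSystem from the System.Cmd module. It invokes the specified program with the specified arguments and returns that program's exit code. You can try it out in ghci.

~~~
ghci> :module System.Cmd
ghci> rawSystem "ls" ["-l", "/usr"]
Loading package old-locale-1.0.0.0 ... linking ... done.
Loading package old-time-1.0.0.0 ... linking ... done.
Loading package filepath-1.1.0.0 ... linking ... done.
Loading package directory-1.0.0.0 ... linking ... done.
Loading package unix-2.3.0.0 ... linking ... done.
Loading package process-1.0.0.0 ... linking ... done.
total 124
drwxr-xr-x   2 root root  49152 2008-08-18 11:04 bin
drwxr-xr-x   2 root root   4096 2008-03-09 05:53 games
drwxr-sr-x  10 jimb guile  4096 2006-02-04 09:13 guile
drwxr-xr-x  47 root root   8192 2008-08-08 08:18 include
drwxr-xr-x 107 root root  32768 2008-08-18 11:04 lib
lrwxrwxrwx   1 root root      3 2007-09-24 16:55 lib64 -> lib
drwxrwsr-x  17 root staff  4096 2008-06-24 17:35 local
drwxr-xr-x   2 root root   8192 2008-08-18 11:03 sbin
drwxr-xr-x 181 root root   8192 2008-08-12 10:11 share
drwxrwsr-x   2 root src    4096 2007-04-10 16:28 src
drwxr-xr-x   3 root root   4096 2008-07-04 19:03 X11R6
ExitSuccess
~~~

Here, we run the equivalent of the shell command `ls -l /usr`. rawSystem does not parse arguments from a string or expand wildcards [[43]](#). Instead, it expects every argument to be contained in a list. If you don't want to pass any arguments, simply pass an empty list, like this:

~~~
ghci> rawSystem "ls" []
calendartime.ghci  modtime.ghci    rp.ghci        RunProcessSimple.hs
cmd.ghci           posixtime.hs    rps.ghci       timediff.ghci
dir.ghci           rawSystem.ghci  RunProcess.hs  time.ghci
ExitSuccess
~~~

## Directory and file information

The System.Directory module contains quite a few functions for obtaining information from the filesystem. You can get a list of the files in a directory, rename or delete files, copy files, change the current working directory, or create new directories. System.Directory is portable and works on any platform where GHC itself runs.

The [System.Directory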
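library documentation](http://hackage.haskell.org/package/directory-1.0.0.0/docs/System-Directory.html) contains a complete list of its functions. Let's use ghci to demonstrate a few of them. Most of these functions are simple equivalents of C library calls or shell commands.

~~~
ghci> :module System.Directory
ghci> setCurrentDirectory "/etc"
Loading package old-locale-1.0.0.0 ... linking ... done.
Loading package old-time-1.0.0.0 ... linking ... done.
Loading package filepath-1.1.0.0 ... linking ... done.
Loading package directory-1.0.0.0 ... linking ... done.
ghci> getCurrentDirectory
"/etc"
ghci> setCurrentDirectory ".."
ghci> getCurrentDirectory
"/"
~~~

Here we saw commands to change the working directory and to get the current working directory. They are similar to the cd and pwd commands in the POSIX shell.

~~~
ghci> getDirectoryContents "/"
[".","..","lost+found","boot","etc","media","initrd.img","var","usr","bin","dev","home","lib","mnt","proc","root","sbin","tmp","sys","lib64","srv","opt","initrd","vmlinuz",".rnd","www","ultra60","emul",".fonts.cache-1","selinux","razor-agent.log",".svn","initrd.img.old","vmlinuz.old","ugid-survey.bulkdata","ugid-survey.brief"]
~~~

getDirectoryContents returns a list of every entry in the given directory. Note that on POSIX systems this list normally includes the special values "." and "..". You will usually want to filter these out when processing a directory's contents, like this:

~~~
ghci> getDirectoryContents "/" >>= return . filter (`notElem` [".", ".."])
["lost+found","boot","etc","media","initrd.img","var","usr","bin","dev","home","lib","mnt","proc","root","sbin","tmp","sys","lib64","srv","opt","initrd","vmlinuz",".rnd","www","ultra60","emul",".fonts.cache-1","selinux","razor-agent.log",".svn","initrd.img.old","vmlinuz.old","ugid-survey.bulkdata","ugid-survey.brief"]
~~~

Tip: For a more detailed discussion of filtering the results of getDirectoryContents, see [*Chapter 8: Efficient file processing, regular expressions, and file name matching*](#).

Does the `` filter (`notElem` [".", ".."]) `` part look puzzling? It could also be written as `filter (\c -> not $ elem c [".", ".."])`. The backticks let us effectively pass the second argument to notElem; the "Infix functions" section has more information on backticks.

You can also ask the system where certain directories are located. This queries the underlying operating system for the information.

~~~
ghci> getHomeDirectory
"/home/bos"
ghci> getAppUserDataDirectory "myApp"
"/home/bos/.myApp"
ghci> getUserDocumentsDirectory
"/home/bos"
~~~

## Program termination

Developers often write standalone programs to accomplish particular tasks. These pieces may be combined to accomplish larger tasks, with a shell script or another program executing them. The calling script needs a way to find out whether the program it called completed successfully. Haskell automatically reports an unsuccessful exit code whenever a program is aborted by an exception.

However, you may need finer-grained control over the exit code; perhaps you need to return different codes for different kinds of errors. The System.Exit module provides a way to exit the program with a specific exit status. Call `exitWith ExitSuccess` to report success (0 on POSIX systems), or call something like `exitWith (ExitFailure 5)` to return 5 to the system as the exit code.

## Dates and times

Everything from file timestamps to business transactions involves dates and times. Beyond simply retrieving date and time information from the system, Haskell provides many ways of manipulating dates and times.

## ClockTime and CalendarTime

In Haskell, dates and times are primarily handled by the System.Time module. It defines two types: ClockTime and CalendarTime.

ClockTime is the Haskell version of the traditional POSIX epoch timestamp. A ClockTime represents a time relative to midnight on January 1, 1970, UTC. A negative ClockTime is a number of seconds before that moment; a positive one is a number of seconds after it.

ClockTime is convenient for computations. Because it tracks Coordinated Universal Time (UTC), it does not have to adjust for local time zones, daylight saving time, or other special cases of time handling. Every day is exactly (60 * 60 * 24) or 86,400 seconds [[44]](#), which makes time interval calculations simple. For example, you can record the time a program started and the time it finished, and subtract the two to find how long it ran. If you like, you can then divide by 3600 to display the result in hours.

ClockTime is ideal for answering questions such as:

- How much time has elapsed?
- What was the time exactly 14 days before this moment?
- When was the file last modified?
- What is the precise time right now?

ClockTime handles these well because they refer to precise, unambiguous moments in time. However, ClockTime is not well suited to questions such as:

- Is today Monday?
- What day of the week will May 1 fall on next year?
- What is the current time in my local time zone, taking daylight saving time into account?

CalendarTime stores a time the way humans do: year, month, day, hour, minute, second, time zone, and daylight saving time information. It is easy to convert into a displayable string, or to use for answering the questions above.

You can convert between ClockTime and CalendarTime at will. Haskell can convert a ClockTime to a CalendarTime in the local time zone, or to a CalendarTime representing UTC.

#### Using ClockTime

In System.Time, ClockTime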
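is defined like this:

~~~
data ClockTime = TOD Integer Integer
~~~

The first Integer is the number of seconds since the Unix epoch. The second Integer is an additional number of picoseconds. Because ClockTime in Haskell uses the unbounded Integer type, the range of times it can represent is limited only by computational resources.

Let's look at some ways to use ClockTime. First, there is the getClockTime function, which returns the current time according to the system clock.

~~~
ghci> :module System.Time
ghci> getClockTime
Loading package old-locale-1.0.0.0 ... linking ... done.
Loading package old-time-1.0.0.0 ... linking ... done.
Mon Aug 18 12:10:38 CDT 2008
~~~

If you run getClockTime again a second later, it returns an updated time. The command prints a readable string, complete with day-of-week information; that comes from the Show instance for ClockTime. Let's look at ClockTime at a lower level:

~~~
ghci> TOD 1000 0
Wed Dec 31 18:16:40 CST 1969
ghci> getClockTime >>= (\(TOD sec _) -> return sec)
1219079438
~~~

Here we first construct a ClockTime representing the point in time 1000 seconds after midnight on January 1, 1970, UTC. Depending on your time zone, that moment may correspond to the evening of December 31, 1969. The second example shows how to pull the number of seconds out of the value returned by getClockTime. We can then manipulate it like this:

~~~
ghci> getClockTime >>= (\(TOD sec _) -> return (TOD (sec + 86400) 0))
Tue Aug 19 12:10:38 CDT 2008
~~~

This displays the time exactly 24 hours from now in your local time zone, since 24 hours is 86,400 seconds.

#### Using CalendarTime

As its name implies, CalendarTime represents time the way a calendar does. It has fields for information such as year, month, and day. CalendarTime and its associated types are defined like this:

~~~
data CalendarTime = CalendarTime
   {ctYear :: Int,         -- Year (post-Gregorian)
    ctMonth :: Month,
    ctDay :: Int,          -- Day of the month (1 to 31)
    ctHour :: Int,         -- Hour of the day (0 to 23)
    ctMin :: Int,          -- Minutes (0 to 59)
    ctSec :: Int,          -- Seconds (0 to 61, allowing for leap seconds)
    ctPicosec :: Integer,  -- Picoseconds
    ctWDay :: Day,         -- Day of the week
    ctYDay :: Int,         -- Day of the year (0 to 364 or 365)
    ctTZName :: String,    -- Name of timezone
    ctTZ :: Int,           -- Variation from UTC in seconds
    ctIsDST :: Bool        -- True if Daylight Saving Time in effect
   }

data Month = January | February | March | April | May | June
           | July | August | September | October | November | December

data Day = Sunday | Monday | Tuesday | Wednesday
         | Thursday | Friday | Saturday
~~~

A few things about these structures should be highlighted:

- ctWDay, ctYDay, and ctTZName are generated by the library functions that create a CalendarTime, but they are not used in calculations. If you create a CalendarTime by hand, you need not put accurate values in these fields unless your later calculations depend on them.
- All three types are members of the Eq, Ord, Read, and Show typeclasses. In addition, Month and Day are declared as members of the Enum and Bounded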
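typeclasses. For more information, see the "Important typeclasses" section.

There are several ways to create a CalendarTime. One is to convert a ClockTime to a CalendarTime, like this:

~~~
ghci> :module System.Time
ghci> now <- getClockTime
Loading package old-locale-1.0.0.0 ... linking ... done.
Loading package old-time-1.0.0.0 ... linking ... done.
Mon Aug 18 12:10:35 CDT 2008
ghci> nowCal <- toCalendarTime now
CalendarTime {ctYear = 2008, ctMonth = August, ctDay = 18, ctHour = 12, ctMin = 10, ctSec = 35, ctPicosec = 804267000000, ctWDay = Monday, ctYDay = 230, ctTZName = "CDT", ctTZ = -18000, ctIsDST = True}
ghci> let nowUTC = toUTCTime now
ghci> nowCal
CalendarTime {ctYear = 2008, ctMonth = August, ctDay = 18, ctHour = 12, ctMin = 10, ctSec = 35, ctPicosec = 804267000000, ctWDay = Monday, ctYDay = 230, ctTZName = "CDT", ctTZ = -18000, ctIsDST = True}
ghci> nowUTC
CalendarTime {ctYear = 2008, ctMonth = August, ctDay = 18, ctHour = 17, ctMin = 10, ctSec = 35, ctPicosec = 804267000000, ctWDay = Monday, ctYDay = 230, ctTZName = "UTC", ctTZ = 0, ctIsDST = False}
~~~

We use getClockTime to get the current ClockTime from the system. Next, toCalendarTime converts the ClockTime to a CalendarTime in the local time zone. toUTCTime performs a similar conversion, but its result is expressed in UTC.

Note that toCalendarTime is an IO function, but toUTCTime is not. The reason is that toCalendarTime returns a different result depending on the local time zone, whereas toUTCTime always returns the same result for the same ClockTime.

It is easy to modify a CalendarTime value:

~~~
ghci> nowCal {ctYear = 1960}
CalendarTime {ctYear = 1960, ctMonth = August, ctDay = 18, ctHour = 12, ctMin = 10, ctSec = 35, ctPicosec = 804267000000, ctWDay = Monday, ctYDay = 230, ctTZName = "CDT", ctTZ = -18000, ctIsDST = True}
ghci> (\(TOD sec _) -> sec) (toClockTime nowCal)
1219079435
ghci> (\(TOD sec _) -> sec) (toClockTime (nowCal {ctYear = 1960}))
-295685365
~~~

Here, we took the earlier CalendarTime value and changed its year to 1960. We then used toClockTime to convert the original value to a ClockTime, and then the new value, so you can see the difference. Notice that the modified value shows a negative number of seconds once converted to a ClockTime. That is to be expected, since a ClockTime is an offset from midnight on January 1, 1970, UTC.

You can also create CalendarTime values by hand, like this:

~~~
ghci> let newCT = CalendarTime 2010 January 15 12 30 0 0 Sunday 0 "UTC" 0 False
ghci> newCT
CalendarTime {ctYear = 2010, ctMonth = January, ctDay = 15, ctHour = 12, ctMin = 30, ctSec = 0, ctPicosec = 0, ctWDay = Sunday, ctYDay = 0, ctTZName = "UTC", ctTZ = 0, ctIsDST = False}
ghci> (\(TOD sec _) -> sec) (toClockTime newCT)
1263558600
~~~

Note that even though January 15, 2010 is not a Sunday, and is not day 0 of the year, the system handled these cases gracefully. In fact, if we convert the value to a ClockTime and then back to a CalendarTime, you will find those fields properly filled in:

~~~
ghci> toUTCTime . toClockTime $ newCT
CalendarTime {ctYear = 2010, ctMonth = January, ctDay = 15, ctHour = 12, ctMin = 30, ctSec = 0, ctPicosec = 0, ctWDay = Friday, ctYDay = 14, ctTZName = "UTC", ctTZ = 0, ctIsDST = False}
~~~

#### TimeDiff for ClockTime

Because it can be difficult to manage differences between ClockTime values in a human-friendly way, the System.Time module includes a TimeDiff type. TimeDiff makes these differences convenient to handle. It is defined like this:

~~~
data TimeDiff = TimeDiff
   {tdYear :: Int,
    tdMonth :: Int,
    tdDay :: Int,
    tdHour :: Int,
    tdMin :: Int,
    tdSec :: Int,
    tdPicosec :: Integer}
~~~

addToClockTime takes a TimeDiff and a ClockTime; internally it converts the ClockTime to a CalendarTime in UTC, applies the TimeDiff, and converts the result back to a ClockTime. diffClockTimes goes the other way: it takes two ClockTime values and returns their difference as a TimeDiff. Let's see how they work:

~~~
ghci> :module System.Time
ghci> let feb5 = toClockTime $ CalendarTime 2008 February 5 0 0 0 0 Sunday 0 "UTC" 0 False
Loading package old-locale-1.0.0.0 ... linking ... done.
Loading package old-time-1.0.0.0 ... linking ... done.
ghci> feb5
Mon Feb  4 18:00:00 CST 2008
ghci> addToClockTime (TimeDiff 0 1 0 0 0 0 0) feb5
Tue Mar  4 18:00:00 CST 2008
ghci> toUTCTime $ addToClockTime (TimeDiff 0 1 0 0 0 0 0) feb5
CalendarTime {ctYear = 2008, ctMonth = March, ctDay = 5, ctHour = 0, ctMin = 0, ctSec = 0, ctPicosec = 0, ctWDay = Wednesday, ctYDay = 64, ctTZName = "UTC", ctTZ = 0, ctIsDST = False}
ghci> let jan30 = toClockTime $ CalendarTime 2009 January 30 0 0 0 0 Sunday 0 "UTC" 0 False
ghci> jan30
Thu Jan 29 18:00:00 CST 2009
ghci> addToClockTime (TimeDiff 0 1 0 0 0 0 0) jan30
Sun Mar  1 18:00:00 CST 2009
ghci> toUTCTime $ addToClockTime (TimeDiff 0 1 0 0 0 0 0) jan30
CalendarTime {ctYear = 2009, ctMonth = March, ctDay = 2, ctHour = 0, ctMin = 0, ctSec = 0, ctPicosec = 0, ctWDay = Monday, ctYDay = 60, ctTZName = "UTC", ctTZ = 0, ctIsDST = False}
ghci> diffClockTimes jan30 feb5
TimeDiff {tdYear = 0, tdMonth = 0, tdDay = 0, tdHour = 0, tdMin = 0, tdSec = 31104000, tdPicosec = 0}
ghci> normalizeTimeDiff $ diffClockTimes jan30 feb5
TimeDiff {tdYear = 0, tdMonth = 12, tdDay = 0, tdHour = 0, tdMin = 0, tdSec = 0, tdPicosec = 0}
~~~

First, we create a ClockTime representing midnight on February 5, 2008, UTC. Note that unless your time zone is UTC, it may be displayed as the evening of February 4 when printed in your local time zone. Next, we add one month to it with addToClockTime. 2008 is a leap year, but the system handles that properly, and we get a result with the same day and time one month later. Using toUTCTime, we can see the result expressed in UTC.

For the second experiment, we set up a time representing midnight on January 30, 2009, UTC. 2009 is not a leap year, so we might wonder what adding one month to it yields. Because neither February 29 nor February 30 exists in 2009, we wind up with March 2.

Finally, we see how diffClockTimes turns two ClockTime values into a TimeDiff, though only the seconds and picoseconds are filled in. The normalizeTimeDiff function takes such a TimeDiff and reformats it as a human might expect to see it.

## File modification times

Many programs need to find out when particular files were last modified. ls and graphical file managers are typical examples of programs that display a file's last modification time. The System.Directory module contains a cross-platform getModificationTime function. It takes a filename and returns a ClockTime representing the time the file was last modified. For example:

~~~
ghci> :module System.Directory
ghci> getModificationTime "/etc/passwd"
Loading package old-locale-1.0.0.0 ... linking ... done.
Loading package old-time-1.0.0.0 ... linking ... done.
Loading package filepath-1.1.0.0 ... linking ... done.
Loading package directory-1.0.0.0 ... linking ... done.
Fri Aug 15 08:29:48 CDT 2008
~~~

POSIX platforms maintain not just a modification time (known as mtime), but also the time of last read or write access (atime) and the time of last status change (ctime). Because this information is POSIX-specific, the cross-platform System.Directory module cannot provide access to it. Instead, you need to use functions in the System.Posix.Files module. Here is an example:

~~~
-- file: ch20/posixtime.hs
-- posixtime.hs

import System.Posix.Files
import System.Time
import System.Posix.Types

-- | Given a path, returns (atime, mtime, ctime)
getTimes :: FilePath -> IO (ClockTime, ClockTime, ClockTime)
getTimes fp =
    do stat <- getFileStatus fp
       return (toct (accessTime stat),
               toct (modificationTime stat),
               toct (statusChangeTime stat))

-- | Convert an EpochTime to a ClockTime
toct :: EpochTime -> ClockTime
toct et =
    TOD (truncate (toRational et)) 0
~~~

Notice the call to getFileStatus. That call maps directly to the C function stat(). Its return value stores a wide assortment of information, including the file type, permissions, owner, group, and the three time values we are interested in. System.Posix.Files provides several functions, such as accessTime, that extract the time we want from the FileStatus value returned by getFileStatus.

Functions such as accessTime return a POSIX-specific type called EpochTime, which the toct function converts to a ClockTime. System.Posix.Files also provides a setFileTimes function to set the atime and mtime of a file. [[45]](#)

## Extended example: Piping

We have seen how to invoke external programs. Sometimes we need more control than that: perhaps we need to capture a program's standard output, supply its input, or even chain several external programs together. Pipes help with all of these needs.

Pipes are used often in shell scripts. When you set up a pipe in the shell, you run multiple programs. The output of the first program is sent to the input of the second. Its output is sent to the third as input, and so on. The last program's output normally goes to the terminal, or it could go to a file. Here is an example session with the POSIX shell that illustrates piping:

~~~
$ ls /etc | grep 'm.*ap' | tr a-z A-Z
IDMAPD.CONF
MAILCAP
MAILCAP.ORDER
MEDIAPRM
TERMCAP
~~~

This command runs three programs, piping data between them. It starts with `ls /etc`, which outputs a list of all the files and directories in /etc. The output of ls is sent as input to grep. We give grep a regular expression that makes it output only the lines containing an 'm' followed, somewhere later in the line, by "ap". Finally, the result is sent to tr. We give tr options that make it convert everything to uppercase. The output of tr isn't sent anywhere in particular, so it is displayed on the screen.

In this situation, the shell sets up all the piping between the programs. Using some of the POSIX tools in Haskell, we can accomplish the same thing.

Before describing how to do this, we should warn you that the System.Posix modules expose a very low-level interface to Unix systems. These interfaces can be complex, and so can their interactions, whichever programming language you use to access them. The full nature of these low-level interfaces has been the topic of entire books, so in this chapter we will only scratch the surface.

## Using pipes for redirection

POSIX defines a function that creates a pipe. This function returns two file descriptors (FDs), which are similar in concept to a Haskell Handle. One FD is the reading end of the pipe, and the other is the writing end. Anything written to the writing end can be read from the reading end; the data is "shoved through the pipe". In Haskell, you access this interface through createPipe.

Creating a pipe is the first step toward piping data between external programs. We must also redirect the output of one program into the pipe, and use the pipe as the input of another program. The Haskell function dupTo accomplishes this: it takes an FD and makes a copy of it at another FD number. For standard input, standard output, and standard error, POSIX predefines the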
FD numbers 0, 1, and 2, respectively. By renumbering an endpoint of a pipe to one of those numbers, we can effectively redirect a program's input or output.

There is another problem to solve, however. We can't simply use dupTo before a call such as rawSystem, because that would mess up the standard input or output of our main Haskell process. Moreover, rawSystem blocks until the invoked program finishes, leaving us no way to start multiple processes running in parallel. To solve this, we use forkProcess. This is a very special function: it actually makes a copy of the currently running process, and you wind up with two copies of the program running at the same time. Haskell's forkProcess function takes a function to execute in the new process (known as the child). We have that function call dupTo. After it has done that, it calls executeFile to actually invoke the command. This is also a special function: if all goes well, it never returns. That is because executeFile replaces the running process with a different program. Eventually, the original Haskell process calls getProcessStatus to wait for the child process to terminate and learn its exit code.

Whenever you run a command on a POSIX system, whether by typing ls on the command line or by calling rawSystem in Haskell, under the hood forkProcess, executeFile, and getProcessStatus (or their C equivalents) are always being used. To set up pipes, we replicate the procedure the system uses to start programs, and add a few steps involving piping and redirection.

There are a few other housekeeping things to watch out for. When you call forkProcess, just about everything about your program is cloned [[46]](#). That includes the set of open file descriptors (handles). Programs detect that they have finished receiving input from a pipe by checking for the end-of-file indicator. When the process at the writing end of a pipe closes the pipe, the process at the reading end receives an end-of-file indication. However, if the writing file descriptor exists in more than one process, the end-of-file indicator is not sent until every process has closed that FD. Therefore, we must keep track of which FDs are open so that we can close them all in the child processes. We must also close the child's ends of the pipes in the parent process as early as possible.

Here is an initial implementation of a system of piping in Haskell:

~~~
-- file: ch20/RunProcessSimple.hs

{-# OPTIONS_GHC -XDatatypeContexts #-}
{-# OPTIONS_GHC -XTypeSynonymInstances #-}
{-# OPTIONS_GHC -XFlexibleInstances #-}

module RunProcessSimple where

--import System.Process
import Control.Concurrent
import Control.Concurrent.MVar
import System.IO
import System.Exit
import Text.Regex.Posix
import System.Posix.Process
import System.Posix.IO
import System.Posix.Types
import Control.Exception

{- | The type for running external commands.  The first part
of the tuple is the program name.  The list represents the
command-line parameters to pass to the command. -}
type SysCommand = (String, [String])

{- | The result of running any command -}
data CommandResult = CommandResult {
    cmdOutput :: IO String,              -- ^ IO action that yields the output
    getExitStatus :: IO ProcessStatus    -- ^ IO action that yields exit result
    }

{- | The type for handling global lists of FDs to always close in the clients
-}
type CloseFDs = MVar [Fd]

{- | Class representing anything that is a runnable command -}
class CommandLike a where
    {- | Given the command and a String representing input,
         invokes the command.  Returns a String
         representing the output of the command. -}
    invoke :: a -> CloseFDs -> String -> IO CommandResult

-- Support for running system commands
instance CommandLike SysCommand where
    invoke (cmd, args) closefds input = do
        -- Create two pipes: one to handle stdin and the other
        -- to handle stdout.  We do not redirect stderr in this program.
        (stdinread, stdinwrite) <- createPipe
        (stdoutread, stdoutwrite) <- createPipe

        -- We add the parent FDs to this list because we always need
        -- to close them in the clients.
        addCloseFDs closefds [stdinwrite, stdoutread]

        -- Now, grab the closed FDs list and fork the child.
        childPID <- withMVar closefds (\fds ->
                        forkProcess (child fds stdinread stdoutwrite))

        -- Now, on the parent, close the client-side FDs.
        closeFd stdinread
        closeFd stdoutwrite

        -- Write the input to the command.
        stdinhdl <- fdToHandle stdinwrite
        forkIO $ do
            hPutStr stdinhdl input
            hClose stdinhdl

        -- Prepare to receive output from the command
        stdouthdl <- fdToHandle stdoutread

        -- Set up the function to call when ready to wait for the
        -- child to exit.
        let waitfunc = do
                status <- getProcessStatus True False childPID
                case status of
                    Nothing -> fail $ "Error: Nothing from getProcessStatus"
                    Just ps -> do
                        removeCloseFDs closefds [stdinwrite, stdoutread]
                        return ps
        return $ CommandResult {cmdOutput = hGetContents stdouthdl,
                                getExitStatus = waitfunc}

      -- Define what happens in the child process
      where
        child closefds stdinread stdoutwrite = do
            -- Copy our pipes over the regular stdin/stdout FDs
            dupTo stdinread stdInput
            dupTo stdoutwrite stdOutput

            -- Now close the original pipe FDs
            closeFd stdinread
            closeFd stdoutwrite

            -- Close all the open FDs we inherited from the parent
            mapM_ (\fd -> catch (closeFd fd) (\(SomeException e) -> return ())) closefds

            -- Start the program
            executeFile cmd True args Nothing

-- Add FDs to the list of FDs that must be closed post-fork in a child
addCloseFDs :: CloseFDs -> [Fd] -> IO ()
addCloseFDs closefds newfds =
    modifyMVar_ closefds (\oldfds -> return $ oldfds ++ newfds)

-- Remove FDs from the list
removeCloseFDs :: CloseFDs -> [Fd] -> IO ()
removeCloseFDs closefds removethem =
    modifyMVar_ closefds (\fdlist -> return $ procfdlist fdlist removethem)
  where
    procfdlist fdlist [] = fdlist
    procfdlist fdlist (x:xs) = procfdlist (removefd fdlist x) xs

    -- We want to remove only the first occurrence of any given fd
    removefd [] _ = []
    removefd (x:xs) fd
        | fd == x = xs
        | otherwise = x : removefd xs fd

{- | Type representing a pipe.  A 'PipeCommand' consists of a source
and destination part, both of which must be instances of
'CommandLike'. -}
data (CommandLike src, CommandLike dest) =>
     PipeCommand src dest = PipeCommand src dest

{- | A convenient function for creating a 'PipeCommand'. -}
(-|-) :: (CommandLike a, CommandLike b) => a -> b -> PipeCommand a b
(-|-) = PipeCommand

{- | Make 'PipeCommand' runnable as a command -}
instance (CommandLike a, CommandLike b) =>
         CommandLike (PipeCommand a b) where
    invoke (PipeCommand src dest) closefds input = do
        res1 <- invoke src closefds input
        output1 <- cmdOutput res1
        res2 <- invoke dest closefds output1
        return $ CommandResult (cmdOutput res2) (getEC res1 res2)

{- | Given two 'CommandResult' items, evaluate the exit codes for
both and then return a "combined" exit code.  This will be ExitSuccess
if both exited successfully.  Otherwise, it will reflect the first
error encountered. -}
getEC :: CommandResult -> CommandResult -> IO ProcessStatus
getEC src dest = do
    sec <- getExitStatus src
    dec <- getExitStatus dest
    case sec of
        Exited ExitSuccess -> return dec
        x -> return x

{- | Execute a 'CommandLike'. -}
runIO :: CommandLike a => a -> IO ()
runIO cmd = do
    -- Initialize our closefds list
    closefds <- newMVar []

    -- Invoke the command
    res <- invoke cmd closefds []

    -- Process its output
    output <- cmdOutput res
    putStr output

    -- Wait for termination and get exit status
    ec <- getExitStatus res
    case ec of
        Exited ExitSuccess -> return ()
        x -> fail $ "Exited: " ++ show x
~~~

Before looking at how this code works, let's experiment with it a bit in ghci:

~~~
ghci> runIO $ ("pwd", []::[String])
/Users/Blade/sandbox
ghci> runIO $ ("ls", ["/usr"])
NX
X11
X11R6
bin
include
lib
libexec
local
sbin
share
standalone
ghci> runIO $ ("ls", ["/usr"]) -|- ("grep", ["^l"])
lib
libexec
local
ghci> runIO $ ("ls", ["/etc"]) -|- ("grep", ["m.*ap"]) -|- ("tr", ["a-z", "A-Z"])
COM.APPLE.SCREENSHARING.AGENT.LAUNCHD
~~~

We start by running a simple command, pwd, which just prints the name of the current working directory. We pass [] as the list of arguments, because pwd doesn't need any. Because of the typeclasses used, Haskell can't infer the type of [], so we state explicitly that it is a list of Strings.

Then we get into more complex commands. We run ls, sending its output through grep. At the end, we set up a pipe to run the exact same command that we ran via a shell-built pipe at the start of this section. It is not yet as pleasant as it was in the shell, but then again our program is still relatively simple compared to the shell.

Let's read through the program. The OPTIONS_GHC pragmas at the very top have the same effect as passing the corresponding flags to ghc or ghci on the command line. We use a GHC extension that lets the (String, [String]) type be an instance of a typeclass [[47]](#)
and, by putting the declaration in the source file, we don't have to remember to enable the compiler switches by hand every time we use this module.

After importing the modules we need, we define a few types. First, we define `type SysCommand = (String, [String])` as an alias. This is the type of a command the system will receive and execute; every command in the examples above used data of this type. The CommandResult type represents the result of executing a given command, and the CloseFDs type represents the list of file descriptors that must be closed in a newly forked child process.

Next, we define a class named CommandLike. This class is used to run "things", where a "thing" might be a standalone program, a pipe set up between two programs, or in the future even a pure Haskell function. To be a member of this class, a type need implement only one function: invoke. This lets us use runIO to start either a standalone command or a pipeline. It is also useful when defining a pipe, since either side of a pipe may itself be a whole stack of commands.

Our piping infrastructure uses strings to pass data between processes. We take advantage of Haskell's support for lazy reading via hGetContents, and use forkIO to write in the background. This design works well, although it is not as fast as connecting the pipe endpoints of two processes directly [[48]](#). It does make the implementation quite simple, however. We only need to take care never to do anything that would require the entire String to be buffered, and let Haskell's laziness do the rest.

Next, we define an instance of CommandLike for SysCommand. We create two pipes: one to use for the new process's standard input, the other for its standard output. This gives two read ends and two write ends: four file descriptors. We add the parent's file descriptors to the list of those that must be closed in all children: the write end of the child's standard input and the read end of the child's standard output. Then we fork the child process. In the parent, we can then close the file descriptors that belong to the child. We can't do that before the fork, because they would then not be available to the child. We obtain a handle for stdinwrite and start a thread via forkIO to write the input data to it. We then define waitfunc, the action the caller will run when it is ready to wait for the child to terminate. Meanwhile, the child uses dupTo, closes the file descriptors it doesn't need, and executes the command.

Next, we define some utility functions to manage the list of file descriptors. After that, we define the tools that help set up pipelines. First, we define a new type, PipeCommand, which has a source and a destination, both of which must be members of CommandLike. For convenience, we also define the -|- operator. Then we make PipeCommand an instance of CommandLike. Its invoke implementation starts the first command, obtains its output, and passes that output to the second command. It then returns the output of the second command, and arranges for getExitStatus to wait for and check the exit statuses of both commands.

We finish by defining runIO. This function sets up the list of FDs that must be closed in the children, runs the command, displays its output, and checks its exit status.

## Better piping

Our previous example solved the basic need of letting us set up shell-like pipes. There are some other features that would be nice to have:

- Support for more shell-like syntax.
- Letting pipes include both external programs and regular Haskell functions, freely mixing the two.
- Returning standard output and the exit code in a way that is easy for Haskell programs to use.

Fortunately, most of the pieces needed to support these are already in place. We only need to add a few more instances of CommandLike and a few more functions similar to runIO. Here is a revised example that implements all of these features:

~~~
-- file: ch20/RunProcess.hs

{-# OPTIONS_GHC -XDatatypeContexts #-}
{-# OPTIONS_GHC -XTypeSynonymInstances #-}
{-# OPTIONS_GHC -XFlexibleInstances #-}

module RunProcess where

import System.Process
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Exception
import System.Posix.Directory
import System.Directory(setCurrentDirectory)
import System.IO
import System.Exit
import Text.Regex
import System.Posix.Process
import System.Posix.IO
import System.Posix.Types
import Data.List
import System.Posix.Env(getEnv)

{- | The type for running external commands.  The first part
of the tuple is the program name.  The list represents the
command-line parameters to pass to the command. -}
type SysCommand = (String, [String])

{- | The result of running any command -}
data CommandResult = CommandResult {
    cmdOutput :: IO String,              -- ^ IO action that yields the output
    getExitStatus :: IO ProcessStatus    -- ^ IO action that yields exit result
    }

{- | The type for handling global lists of FDs to always close in the clients
-}
type CloseFDs = MVar [Fd]

{- | Class representing anything that is a runnable command -}
class CommandLike a where
    {- | Given the command and a String representing input,
         invokes the command.  Returns a String
         representing the output of the command. -}
    invoke :: a -> CloseFDs -> String -> IO CommandResult

-- Support for running system commands
instance CommandLike SysCommand where
    invoke (cmd, args) closefds input = do
        -- Create two pipes: one to handle stdin and the other
        -- to handle stdout.  We do not redirect stderr in this program.
        (stdinread, stdinwrite) <- createPipe
        (stdoutread, stdoutwrite) <- createPipe

        -- We add the parent FDs to this list because we always need
        -- to close them in the clients.
        addCloseFDs closefds [stdinwrite, stdoutread]

        -- Now, grab the closed FDs list and fork the child.
        childPID <- withMVar closefds (\fds ->
                        forkProcess (child fds stdinread stdoutwrite))

        -- Now, on the parent, close the client-side FDs.
        closeFd stdinread
        closeFd stdoutwrite

        -- Write the input to the command.
        stdinhdl <- fdToHandle stdinwrite
        forkIO $ do
            hPutStr stdinhdl input
            hClose stdinhdl

        -- Prepare to receive output from the command
        stdouthdl <- fdToHandle stdoutread

        -- Set up the function to call when ready to wait for the
        -- child to exit.
        let waitfunc = do
                status <- getProcessStatus True False childPID
                case status of
                    Nothing -> fail $ "Error: Nothing from getProcessStatus"
                    Just ps -> do
                        removeCloseFDs closefds [stdinwrite, stdoutread]
                        return ps
        return $ CommandResult {cmdOutput = hGetContents stdouthdl,
                                getExitStatus = waitfunc}

      -- Define what happens in the child process
      where
        child closefds stdinread stdoutwrite = do
            -- Copy our pipes over the regular stdin/stdout FDs
            dupTo stdinread stdInput
            dupTo stdoutwrite stdOutput

            -- Now close the original pipe FDs
            closeFd stdinread
            closeFd stdoutwrite

            -- Close all the open FDs we inherited from the parent
            mapM_ (\fd -> catch (closeFd fd) (\(SomeException e) -> return ())) closefds

            -- Start the program
            executeFile cmd True args Nothing

{- | An instance of 'CommandLike' for an external command.  The String is
passed to a shell for evaluation and invocation. -}
instance CommandLike String where
    invoke cmd closefds input = do
        -- Use the shell given by the environment variable SHELL,
        -- if any.  Otherwise, use /bin/sh
        esh <- getEnv "SHELL"
        let sh = case esh of
                     Nothing -> "/bin/sh"
                     Just x -> x
        invoke (sh, ["-c", cmd]) closefds input

-- Add FDs to the list of FDs that must be closed post-fork in a child
addCloseFDs :: CloseFDs -> [Fd] -> IO ()
addCloseFDs closefds newfds =
    modifyMVar_ closefds (\oldfds -> return $ oldfds ++ newfds)

-- Remove FDs from the list
removeCloseFDs :: CloseFDs -> [Fd] -> IO ()
removeCloseFDs closefds removethem =
    modifyMVar_ closefds (\fdlist -> return $ procfdlist fdlist removethem)
  where
    procfdlist fdlist [] = fdlist
    procfdlist fdlist (x:xs) = procfdlist (removefd fdlist x) xs

    -- We want to remove only the first occurrence of any given fd
    removefd [] _ = []
    removefd (x:xs) fd
        | fd == x = xs
        | otherwise = x : removefd xs fd

-- Support for running Haskell commands
instance CommandLike (String -> IO String) where
    invoke func _ input =
        return $ CommandResult (func input) (return (Exited ExitSuccess))

-- Support pure Haskell functions by wrapping them in IO
instance CommandLike (String -> String) where
    invoke func = invoke iofunc
      where
        iofunc :: String -> IO String
        iofunc = return . func

-- It's also useful to operate on lines.  Define support for line-based
-- functions both within and without the IO monad.

instance CommandLike ([String] -> IO [String]) where
    invoke func _ input =
        return $ CommandResult linedfunc (return (Exited ExitSuccess))
      where
        linedfunc = func (lines input) >>= (return . unlines)

instance CommandLike ([String] -> [String]) where
    invoke func = invoke (unlines . func . lines)

{- | Type representing a pipe.  A 'PipeCommand' consists of a source
and destination part, both of which must be instances of
'CommandLike'. -}
data (CommandLike src, CommandLike dest) =>
     PipeCommand src dest = PipeCommand src dest

{- | A convenient function for creating a 'PipeCommand'. -}
(-|-) :: (CommandLike a, CommandLike b) => a -> b -> PipeCommand a b
(-|-) = PipeCommand

{- | Make 'PipeCommand' runnable as a command -}
instance (CommandLike a, CommandLike b) =>
         CommandLike (PipeCommand a b) where
    invoke (PipeCommand src dest) closefds input = do
        res1 <- invoke src closefds input
        output1 <- cmdOutput res1
        res2 <- invoke dest closefds output1
        return $ CommandResult (cmdOutput res2) (getEC res1 res2)

{- | Given two 'CommandResult' items, evaluate the exit codes for
both and then return a "combined" exit code.  This will be ExitSuccess
if both exited successfully.  Otherwise, it will reflect the first
error encountered. -}
getEC :: CommandResult -> CommandResult -> IO ProcessStatus
getEC src dest = do
    sec <- getExitStatus src
    dec <- getExitStatus dest
    case sec of
        Exited ExitSuccess -> return dec
        x -> return x

{- | Different ways to get data from 'run'.

 * IO () runs, throws an exception on error, and sends stdout to stdout

 * IO String runs, throws an exception on error, reads stdout into
   a buffer, and returns it as a string.

 * IO [String] is same as IO String, but returns the results as lines

 * IO ProcessStatus runs and returns a ProcessStatus with the exit
   information.  stdout is sent to stdout.  Exceptions are not thrown.

 * IO (String, ProcessStatus) is like IO ProcessStatus, but also
   includes a description of the last command in the pipe to have
   an error (or the last command, if there was no error)

 * IO Int returns the exit code from a program directly.  If a signal
   caused the command to be reaped, returns 128 + SIGNUM.

 * IO Bool returns True if the program exited normally (exit code 0,
   not stopped by a signal) and False otherwise.

-}
class RunResult a where
    {- | Runs a command (or pipe of commands), with results presented
       in any number of different ways. -}
    run :: (CommandLike b) => b -> a

-- | Utility function for use by 'RunResult' instances
setUpCommand :: CommandLike a => a -> IO CommandResult
setUpCommand cmd = do
    -- Initialize our closefds list
    closefds <- newMVar []

    -- Invoke the command
    invoke cmd closefds []

instance RunResult (IO ()) where
    run cmd = run cmd >>= checkResult

instance RunResult (IO ProcessStatus) where
    run cmd = do
        res <- setUpCommand cmd

        -- Process its output
        output <- cmdOutput res
        putStr output

        getExitStatus res

instance RunResult (IO Int) where
    run cmd = do
        rc <- run cmd
        case rc of
            Exited (ExitSuccess) -> return 0
            Exited (ExitFailure x) -> return x
            (Terminated x _) -> return (128 + (fromIntegral x))
            Stopped x -> return (128 + (fromIntegral x))

instance RunResult (IO Bool) where
    run cmd = do
        rc <- run cmd
        return ((rc::Int) == 0)

instance RunResult (IO [String]) where
    run cmd = do
        r <- run cmd
        return (lines r)

instance RunResult (IO String) where
    run cmd = do
        res <- setUpCommand cmd

        output <- cmdOutput res

        -- Force output to be buffered
        evaluate (length output)

        ec <- getExitStatus res
        checkResult ec
        return output

checkResult :: ProcessStatus -> IO ()
checkResult ps =
    case ps of
        Exited (ExitSuccess) -> return ()
        x -> fail (show x)

{- | A convenience function.  Refers only to the version of 'run'
that returns @IO ()@.  This prevents you from having to cast to it
all the time when you do not care about the result of 'run'.
-}
runIO :: CommandLike a => a -> IO ()
runIO = run

------------------------------------------------------------
-- Utility Functions
------------------------------------------------------------
cd :: FilePath -> IO ()
cd = setCurrentDirectory

{- | Takes a string and sends it on as standard output.
The input to this function is never read. -}
echo :: String -> String -> String
echo inp _ = inp

-- | Search for the regexp in the lines.  Return those that match.
grep :: String -> [String] -> [String]
grep pat = filter (ismatch regex)
  where
    regex = mkRegex pat
    ismatch r inp = case matchRegex r inp of
                        Nothing -> False
                        Just _ -> True

{- | Creates the given directory.  A value of 0o755 for mode would be typical.
An alias for System.Posix.Directory.createDirectory. -}
mkdir :: FilePath -> FileMode -> IO ()
mkdir = createDirectory

{- | Remove duplicate lines from a file (like Unix uniq).
Takes a String representing a file or output and plugs it through
lines and then nub to uniqify on a line basis. -}
uniq :: String -> String
uniq = unlines . nub . lines

-- | Count number of lines.  wc -l
wcL, wcW :: [String] -> [String]
wcL inp = [show (genericLength inp :: Integer)]

-- | Count number of words in a file (like wc -w)
wcW inp = [show ((genericLength $ words $ unlines inp) :: Integer)]

sortLines :: [String] -> [String]
sortLines = sort

-- | Count the lines in the input
countLines :: String -> IO String
countLines = return . (++ "\n") . show . length . lines
~~~

The main changes are:

- A new CommandLike instance for String, which uses the shell to evaluate and invoke the string.
- New CommandLike instances for `String -> IO String` and several related types, so that Haskell functions can be processed just like commands.
- A new RunResult typeclass that defines a function run, which returns information about the command in many different ways.
- A few utility functions that provide Haskell implementations of familiar Unix shell commands.

Let's try out the new features. First, let's make sure that the command from the previous example still works. Then, let's run it using the new shell-like syntax.

~~~
ghci> :load RunProcess.hs
[1 of 1] Compiling RunProcess       ( RunProcess.hs, interpreted )
Ok, modules loaded: RunProcess.
ghci> runIO $ ("ls", ["/etc"]) -|- ("grep", ["m.*ap"]) -|- ("tr", ["a-z", "A-Z"])
Loading package array-0.5.0.0 ... linking ... done.
Loading package deepseq-1.3.0.2 ... linking ... done.
Loading package bytestring-0.10.4.0 ... linking ... done.
Loading package containers-0.5.5.1 ... linking ... done.
Loading package filepath-1.3.0.2 ... linking ... done.
Loading package old-locale-1.0.0.6 ... linking ... done.
Loading package time-1.4.2 ... linking ... done.
Loading package unix-2.7.0.1 ... linking ... done.
Loading package directory-1.2.1.0 ... linking ... done.
Loading package process-1.2.0.0 ... linking ... done.
Loading package transformers-0.3.0.0 ... linking ... done.
Loading package mtl-2.1.3.1 ... linking ... done.
Loading package regex-base-0.93.2 ... linking ... done.
Loading package regex-posix-0.95.2 ... linking ... done.
Loading package regex-compat-0.95.1 ... linking ... done.
COM.APPLE.SCREENSHARING.AGENT.LAUNCHD
ghci> runIO $ "ls /etc" -|- "grep 'm.*ap'" -|- "tr a-z A-Z"
COM.APPLE.SCREENSHARING.AGENT.LAUNCHD
~~~

That was a lot easier to type. Let's try substituting our native Haskell implementation of grep, and try out some of the other new features:

~~~
ghci> runIO $ "ls /usr/local/bin" -|- grep "m.*ap" -|- "tr a-z A-Z"
DUMPCAP
MERGECAP
NMAP
ghci> run $ "ls /usr/local/bin" -|- grep "m.*ap" -|- "tr a-z A-Z" :: IO String
"DUMPCAP\nMERGECAP\nNMAP\n"
ghci> run $ "ls /usr/local/bin" -|- grep "m.*ap" -|- "tr a-z A-Z" :: IO [String]
["DUMPCAP","MERGECAP","NMAP"]
ghci> run $ "ls /usr" :: IO String
"X11\nX11R6\nbin\ninclude\nlib\nlibexec\nlocal\nsbin\nshare\nstandalone\ntexbin\n"
ghci> run $ "ls /usr" :: IO Int
X11
X11R6
bin
include
lib
libexec
local
sbin
share
standalone
texbin
0
ghci> runIO $ echo "Line1\nHi, test\n" -|- "tr a-z A-Z" -|- sortLines
HI, TEST
LINE1
~~~

## Final words on pipes

We have developed a sophisticated system here. We warned earlier that POSIX can be complex. One more thing to highlight: always make sure to evaluate the String returned by these functions before you attempt to obtain the exit code of the child process. The child often cannot exit until it has written all of its data, so if you get the order wrong, your program will hang.

In this chapter, we have developed, from the ground up, a simplified version of HSH. If you want these shell-like capabilities in your own programs, we recommend HSH rather than the example developed here, because HSH's implementation is better optimized. HSH also comes with a larger set of utility functions and more capabilities, but the source code behind it is much larger and more complex. Some of the utility functions presented here were in fact copied verbatim from HSH. The HSH source code is available from [http://software.complete.org/hsh](http://software.complete.org/hsh).

Notes

[[43]](#) There is also a function system that takes a single string as its argument and passes it to the shell for parsing. We recommend rawSystem instead, because certain characters have special meaning in the shell, which can lead to security issues or unexpected behavior.

[[44]](#) Some will note that UTC defines irregular leap seconds. The POSIX standard, which Haskell follows, specifies that every day is exactly 86,400 seconds long in its representation, so you need not worry about leap seconds when performing routine calculations. The exact handling of leap seconds is system-dependent and complex, though they can usually be thought of as a "long second". This is generally of interest only when performing precise subsecond calculations.

[[45]](#) It is not normally possible to set ctime on POSIX systems.

[[46]](#) Threads are a major exception: they are not cloned, hence "just about".

[[47]](#) This extension is well supported in the Haskell community. Hugs users can enable it with `hugs -98 +o`.

[[48]](#) The Haskell library HSH provides an API similar to the one presented here, but uses a more efficient (and more complex) mechanism that connects pipes directly between external processes, so that no data has to pass through Haskell. This is the same approach the shell takes, and it reduces the CPU load of handling piping.
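As a rough, pure-Haskell sketch of the idea behind the `String -> String` and `[String] -> [String]` instances of CommandLike described in this chapter, a pipeline stage can be modeled as a function from all of standard input to all of standard output. The names `Stage`, `byLines`, `|>`, `upcase`, and `grepLit` below are assumptions made for illustration only; they are not part of the chapter's code or of HSH, and no processes, pipes, or exit statuses are involved.

```haskell
import Data.Char (toUpper)
import Data.List (isInfixOf, sort)

-- | A pure pipeline stage: all of stdin in, all of stdout out.
-- (Hypothetical type; the chapter's code uses 'CommandLike' instead.)
type Stage = String -> String

-- | Lift a line-based function to a stage, mirroring the
-- @unlines . func . lines@ wrapping used by the chapter's
-- @CommandLike ([String] -> [String])@ instance.
byLines :: ([String] -> [String]) -> Stage
byLines f = unlines . f . lines

-- | Left-to-right composition, so stages read in shell order.
-- (Hypothetical operator; the chapter's '-|-' builds a 'PipeCommand'.)
infixl 1 |>
(|>) :: Stage -> Stage -> Stage
src |> dest = dest . src

-- | Stand-in for @tr a-z A-Z@.
upcase :: Stage
upcase = map toUpper

-- | Keep only lines containing a literal substring (no regex support).
grepLit :: String -> Stage
grepLit needle = byLines (filter (needle `isInfixOf`))

-- | Pure analogue of:
--   echo "Line1\nHi, test\n" -|- "tr a-z A-Z" -|- sortLines
demo :: String
demo = (const "Line1\nHi, test\n" |> upcase |> byLines sort) ""

main :: IO ()
main = putStr demo
```

Because every stage here is an ordinary function, laziness lets data flow through the composed stages incrementally, which is the same property the chapter relies on when it passes lazy Strings between processes. What this sketch cannot show is the part that makes the real implementation delicate: forking, closing inherited file descriptors, and collecting exit statuses only after the output has been consumed.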