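can produce a lot of output data. The user then has to loop again, this time passing nil on every call. Only when the filter itself returns nil can the user be sure that processing is complete.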
Most filters do not need this extra flexibility. Fortunately, the new filter interface is easy to implement. In fact, the end-of-line translation filter we created earlier already satisfies it. The chaining function, on the other hand, becomes much more complicated. If it were not for coroutines, I would not want to implement it at all. If you know of a simple implementation that does not use coroutines, please let me know!

    local function chain2(f1, f2)
        local co = coroutine.create(function(chunk)
            while true do
                local filtered1 = f1(chunk)
                local filtered2 = f2(filtered1)
                local done2 = filtered1 and ""
                while true do
                    if filtered2 == "" or filtered2 == nil then break end
                    coroutine.yield(filtered2)
                    filtered2 = f2(done2)
                end
                if filtered1 == "" then chunk = coroutine.yield(filtered1)
                elseif filtered1 == nil then return nil
                else chunk = chunk and "" end
            end
        end)
        return function(chunk)
            local _, res = coroutine.resume(co, chunk)
            return res
        end
    end
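To see how a caller might drive such a chained filter, the sketch below repeats chain2 and feeds it two made-up filters. The names `make_split`, `upper`, and `run` are assumptions introduced for this illustration only: `make_split` builds an expanding filter that returns its input one character per call, `upper` is an ordinary filter, and `run` is one possible driver loop that follows the protocol described above (pass "" until "" comes back, then pass nil until nil comes back).

```lua
-- chain2 as given in the text
local function chain2(f1, f2)
  local co = coroutine.create(function(chunk)
    while true do
      local filtered1 = f1(chunk)
      local filtered2 = f2(filtered1)
      local done2 = filtered1 and ""
      while true do
        if filtered2 == "" or filtered2 == nil then break end
        coroutine.yield(filtered2)
        filtered2 = f2(done2)
      end
      if filtered1 == "" then chunk = coroutine.yield(filtered1)
      elseif filtered1 == nil then return nil
      else chunk = chunk and "" end
    end
  end)
  return function(chunk)
    local _, res = coroutine.resume(co, chunk)
    return res
  end
end

-- Hypothetical expanding filter: emits one character per call.
local function make_split()
  local pending = {}
  return function(chunk)
    if chunk and chunk ~= "" then
      for c in chunk:gmatch(".") do pending[#pending + 1] = c end
    end
    if #pending > 0 then return table.remove(pending, 1) end
    if chunk == nil then return nil end
    return ""
  end
end

-- Hypothetical ordinary filter: upper-cases each chunk.
local function upper(chunk)
  if chunk == nil then return nil end
  return chunk:upper()
end

-- Hypothetical driver: collects every piece the filter produces.
local function run(filter, chunks)
  local out = {}
  for _, chunk in ipairs(chunks) do
    local piece = filter(chunk)
    while piece ~= nil and piece ~= "" do
      out[#out + 1] = piece
      piece = filter("")          -- ask for more output from the same input
    end
  end
  local piece = filter(nil)       -- signal end of input
  while piece ~= nil do
    if piece ~= "" then out[#out + 1] = piece end
    piece = filter(nil)
  end
  return out
end

local out = run(chain2(make_split(), upper), { "ab", "cd" })
print(table.concat(out, ","))     --> A,B,C,D
```

Note that the driver stops at the first nil. Once the coroutine inside chain2 has returned, resuming it again would hand back an error message instead of data.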
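Chaining sources also becomes more complicated, and a similar coroutine-based solution works there as well. Chaining sinks remains as simple as before. Interestingly, these modifications have no negative impact on the performance of filters that do not need the added flexibility. They do, however, considerably improve the efficiency of filters such as gzip, which is why we keep them.

9 Notes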
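These concepts were developed while we were working on LuaSocket 2.0 and are implemented in the LTN12 module. As a result, the LuaSocket implementation became much simpler and, at the same time, more powerful. The MIME module in particular is integrated with LTN12 and provides many additional filters. We felt these concepts deserved to be made public, even for people who do not care about LuaSocket.

Another idea worth mentioning is the "identity filter". Suppose you want to give the user some feedback while a downloaded file is being passed to a sink. Chain the sink with an identity filter (a filter that simply returns the data it receives, unchanged), and you can update a progress indicator as the data goes by. The original sink does not have to be modified. Another interesting idea is a T-shaped sink: a sink that sends its data to two other sinks. In short, there seems to be plenty of room for other interesting ideas.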
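Both ideas can be sketched in a few lines. Everything named below (`progress_filter`, `chain_sink`, `tee`, `table_sink`) is hypothetical and written for this illustration; in particular, `chain_sink` is a simplified stand-in that assumes the filter never expands data, so it is not a replacement for a full sink-chaining function.

```lua
-- Identity filter: passes data through unchanged and reports the byte count.
local function progress_filter(report)
  local total = 0
  return function(chunk)
    if chunk and chunk ~= "" then
      total = total + #chunk
      report(total)
    end
    return chunk
  end
end

-- Simplified chaining of a non-expanding filter in front of a sink.
local function chain_sink(filter, sink)
  return function(chunk, err)
    return sink(filter(chunk), err)
  end
end

-- T-shaped sink: forwards every chunk to two other sinks.
local function tee(sink1, sink2)
  return function(chunk, err)
    local ok1, err1 = sink1(chunk, err)
    if not ok1 then return nil, err1 end
    local ok2, err2 = sink2(chunk, err)
    if not ok2 then return nil, err2 end
    return true
  end
end

-- Sink that stores non-empty chunks in a table.
local function table_sink(t)
  return function(chunk, err)
    if chunk and chunk ~= "" then t[#t + 1] = chunk end
    return true
  end
end

local a, b, seen = {}, {}, {}
local sink = chain_sink(
  progress_filter(function(n) seen[#seen + 1] = n end),
  tee(table_sink(a), table_sink(b)))

sink("hello ")
sink("world")
sink(nil)   -- end of data

print(table.concat(a), table.concat(b), seen[#seen])
```

Neither `table_sink` needs to know that progress is being tracked or that a second sink receives the same data, which is the point made above about leaving the original sink unmodified.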
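In this technical note we introduced filters, sources, sinks, and pumps. These are useful tools for data processing. Sources provide a simple abstraction for data acquisition. Sinks provide an abstraction for final data destinations. Filters define an interface for data transformations. Chaining filters, sources, and sinks provides an elegant way to build arbitrarily complex data transformations out of simple ones. Pumps set the whole process in motion.

Notes for beginners

Simple sources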
    function source()
        -- we have data
        return chunk

        -- we have an error
        return nil, err

        -- no more data
        return nil
    end
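As a concrete illustration of the three outcomes listed above, here are two small sources. `string_source` and `file_source` are names invented for this sketch, not functions from any library; the file path used at the end is deliberately one that should not exist.

```lua
-- Source that returns a string in pieces of at most `size` bytes.
local function string_source(s, size)
  local pos = 1
  return function()
    if pos > #s then return nil end        -- no more data
    local chunk = s:sub(pos, pos + size - 1)
    pos = pos + size
    return chunk                           -- we have data
  end
end

-- Source that reads a file in pieces and reports an open failure as an error.
local function file_source(path, size)
  local handle, open_err = io.open(path, "rb")
  if not handle then
    return function() return nil, open_err end   -- we have an error
  end
  return function()
    if not handle then return nil end      -- already finished
    local chunk = handle:read(size)
    if not chunk then
      handle:close()
      handle = nil
    end
    return chunk
  end
end

local src = string_source("abcdefg", 3)
print(src(), src(), src(), src())          --> abc def g nil

local chunk, err = file_source("/nonexistent/ltn12-demo.txt", 4)()
print(chunk, err)
```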
Simple sinks

    function sink(chunk, src_err)
        if chunk == nil then
            -- no more data to process, we won't receive more chunks
            if src_err then
                -- source reports an error, TBD what to do with chunk received up to now
            else
                -- do something with concatenation of chunks, all went well
            end
            return true -- or anything that evaluates to true
        elseif chunk == "" then
            -- this is assumed to be without effect on the sink, but may
            -- not be if something different than raw text is processed

            -- do nothing and return true to keep filters happy
            return true -- or anything that evaluates to true
        else
            -- chunk has data, process/store it as appropriate
            return true -- or anything that evaluates to true
        end

        -- in case of error
        return nil, err
    end

Pumps
    ret, err = pump.step(source, sink)
    if ret == 1 then
        -- all ok, continue pumping
    elseif err then
        -- an error occurred in the sink or source. If in both, the sink
        -- error is lost.
    else -- ret == nil and err == nil
        -- done, nothing left to pump
    end

    ret, err = pump.all(source, sink)
    if ret == 1 then
        -- all OK, done
    elseif err then
        -- an error occurred
    else
        -- impossible
    end
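To make the return conventions above concrete without depending on any installed library, the sketch below defines stand-alone `pump_step` and `pump_all` functions that behave as the comments describe (1 on success, nil plus a message on error, a source error taking precedence over a sink error). These two functions, together with `string_source` and `table_sink`, are assumptions written for this example.

```lua
-- One pumping step: move a single chunk from source to sink.
local function pump_step(source, sink)
  local chunk, src_err = source()
  local ret, snk_err = sink(chunk, src_err)
  if chunk and ret then return 1 end
  return nil, src_err or snk_err     -- nil, nil means "done"
end

-- Pump until the source is exhausted or an error shows up.
local function pump_all(source, sink)
  while true do
    local ret, err = pump_step(source, sink)
    if not ret then
      if err then return nil, err end
      return 1
    end
  end
end

local function string_source(s, size)
  local pos = 1
  return function()
    if pos > #s then return nil end
    local chunk = s:sub(pos, pos + size - 1)
    pos = pos + size
    return chunk
  end
end

local function table_sink(t)
  return function(chunk, err)
    if chunk and chunk ~= "" then t[#t + 1] = chunk end
    return true
  end
end

-- Normal run: everything arrives in the table.
local parts = {}
local ret, err = pump_all(string_source("hello world", 4), table_sink(parts))
print(ret, err, table.concat(parts, "|"))   --> 1 nil hell|o wo|rld

-- Failing source: the error is passed on to the caller.
local function broken_source() return nil, "read failed" end
local ret2, err2 = pump_all(broken_source, table_sink({}))
print(ret2, err2)                           --> nil read failed
```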
Filters not expanding data

    function filter(chunk)
        -- first two cases are to maintain chaining logic that
        -- support expanding filters (see below)
        if chunk == nil then
            return nil
        elseif chunk == "" then
            return ""
        else
            -- process chunk and return filtered data
            return data
        end
    end
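Since the nil and "" cases are always handled the same way, one possible convenience is a small wrapper that turns any string-to-string function into a filter of this kind. `map_filter` is a hypothetical helper introduced here; it is only suitable for transformations that do not need to remember anything between chunks.

```lua
-- Wrap a stateless string transformation as a non-expanding filter.
local function map_filter(transform)
  return function(chunk)
    if chunk == nil then
      return nil                 -- end of data is passed on as-is
    elseif chunk == "" then
      return ""                  -- empty chunk is passed on as-is
    else
      return transform(chunk)
    end
  end
end

local upper = map_filter(string.upper)
local strip_cr = map_filter(function(s) return (s:gsub("\r", "")) end)

print(upper("abc"))              --> ABC
print(strip_cr("a\r\nb") == "a\nb")
```

The extra parentheses around `s:gsub(...)` drop the second return value of `gsub` (the substitution count), so the filter returns a single string.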
Fancy sources
The idea of fancy sources is to enable a source to indicate which other source contains the data from now on.
    function source()
        -- we have data
        return chunk

        -- we have an error
        return nil, err

        -- no more data
        return nil, nil

        -- no more data in current source, but use sourceB from now on
        return "", sourceB
    end
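As an illustration, a fancy source that hands over to a second source once it is exhausted might look like the sketch below. `list_source` and `simplify_source` are hypothetical names; `simplify_source` is only a guess at what a simplifying wrapper has to do (remember the source indicated in the second return value) and is not the LTN12 implementation.

```lua
-- Fancy source over a list of chunks; optionally names a successor source.
local function list_source(items, next_source)
  local i = 0
  return function()
    i = i + 1
    if items[i] then return items[i] end            -- we have data
    if next_source then return "", next_source end  -- switch sources
    return nil, nil                                 -- no more data
  end
end

-- Wrap a fancy source so that callers see a simple source.
local function simplify_source(fancy)
  local current = fancy
  return function()
    local chunk, err_or_next = current()
    if type(err_or_next) == "function" then
      current = err_or_next      -- use the indicated source from now on
      return chunk
    end
    return chunk, err_or_next
  end
end

local sourceB = list_source({ "b1", "b2" })
local sourceA = list_source({ "a1", "a2" }, sourceB)
local simple = simplify_source(sourceA)

local out = {}
while true do
  local chunk = simple()
  if chunk == nil then break end
  if chunk ~= "" then out[#out + 1] = chunk end
end
print(table.concat(out, ","))    --> a1,a2,b1,b2
```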
Transforming a fancy source into a simple one:

    simple = source.simplify(fancy)

Fancy sinks
The idea of fancy sinks is to enable a sink to indicate which other sink processes the data from now on.
    function sink(chunk, src_err)
        -- same as above (simple sink), except that the next sink to
        -- use, sinkK, is indicated after true
        return true, sinkK
    end
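A small sketch of the idea: a sink that accepts a limited number of chunks and then redirects everything else to another sink. `limited_sink`, `table_sink`, and `simplify_sink` are hypothetical names; `simplify_sink` shows one plausible way to hide the hand-over from the caller and should not be read as the LTN12 implementation.

```lua
-- Plain sink that stores non-empty chunks in a table.
local function table_sink(t)
  return function(chunk, err)
    if chunk and chunk ~= "" then t[#t + 1] = chunk end
    return true
  end
end

-- Fancy sink: stores up to `limit` chunks, then names its successor.
local function limited_sink(store, limit, next_sink)
  return function(chunk, err)
    if chunk and chunk ~= "" then store[#store + 1] = chunk end
    if #store >= limit then return true, next_sink end
    return true
  end
end

-- Wrap a fancy sink so that callers see a simple sink.
local function simplify_sink(fancy)
  local current = fancy
  return function(chunk, err)
    local ret, err_or_next = current(chunk, err)
    if not ret then return nil, err_or_next end
    if type(err_or_next) == "function" then
      current = err_or_next      -- use the indicated sink from now on
    end
    return true
  end
end

local first, rest = {}, {}
local sink = simplify_sink(limited_sink(first, 2, table_sink(rest)))

for _, chunk in ipairs({ "a", "b", "c", "d" }) do sink(chunk) end
sink(nil)                        -- end of data

print(table.concat(first, ","), table.concat(rest, ","))   --> a,b   c,d
```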
Transforming a fancy sink into a simple one:

    simple = sink.simplify(fancy)

Filters expanding data
    function filter(chunk)
        if chunk == nil then
            -- end of data. If some expanded data is still to be returned
            return partial_data
            -- the chains will keep on calling the filter to get all of
            -- the partial data until the filter returns nil
            return nil
        elseif chunk == "" then
            -- a previous filter may have finished returning expanded
            -- data, now it's our turn to expand the data and return it
            return partial_data
            -- the chains will keep on calling the filter to get all of
            -- the partial data until the filter returns ""
            return ""
        else
            -- process chunk and return filtered data, potentially partial.
            -- In all cases, the filter is called again with ""
            return partial_data
        end
    end
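The template above can be instantiated as follows. `chunker` is a hypothetical filter that never returns more than `width` bytes per call, so a large input chunk comes back in several pieces; `run` is an assumed driver loop that follows the calling convention described in the comments (call again with "" until "" is returned, then call with nil until nil is returned).

```lua
-- Expanding filter: returns the data in pieces of at most `width` bytes.
local function chunker(width)
  local buffer = ""
  return function(chunk)
    if chunk and chunk ~= "" then buffer = buffer .. chunk end
    -- Emit a full piece, or the remainder once the input has ended.
    if #buffer >= width or (chunk == nil and #buffer > 0) then
      local piece = buffer:sub(1, width)
      buffer = buffer:sub(width + 1)
      return piece
    end
    if chunk == nil then return nil end   -- input ended, nothing left
    return ""                             -- nothing to return for now
  end
end

-- Driver: feeds chunks to the filter and collects every piece.
local function run(filter, chunks)
  local out = {}
  for _, chunk in ipairs(chunks) do
    local piece = filter(chunk)
    while piece ~= nil and piece ~= "" do
      out[#out + 1] = piece
      piece = filter("")
    end
  end
  local piece = filter(nil)
  while piece ~= nil do
    if piece ~= "" then out[#out + 1] = piece end
    piece = filter(nil)
  end
  return out
end

local out = run(chunker(3), { "abcdefgh", "ij" })
print(table.concat(out, ","))             --> abc,def,ghi,j
```

The last piece, "j", is only released when the filter is called with nil, which is why the driver must keep calling after the input has ended.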