add functionality for iteration over lists #2926

Open
jszigetvari opened this issue Sep 16, 2019 · 12 comments

Comments

@jszigetvari
Contributor

Currently syslog-ng does not offer any built-in functionality that would make it possible to iterate over the elements of a list data structure.
This makes it particularly difficult to perform operations on list elements, like searching, or modification operations.
(At the moment, one has to resort to looping back messages to syslog-ng, to iterate over list elements, which is inconvenient and has a significant performance overhead.)

@lbudai
Collaborator

lbudai commented Oct 1, 2019

@jszigetvari : could you add example configs?

@faxm0dem
Contributor

faxm0dem commented Oct 2, 2019

Will that make it possible to iterate over messages of a grouping-by or patterndb context?

@gaborznagy
Collaborator

@faxm0dem: based on the quick look I took, the template functions $(context-values) and $(context-lookup) do create the kind of lists we are talking about, but not grouping-by() or patterndb.

@bazsi
Collaborator

bazsi commented Oct 2, 2019 via email

@ryanfaircloth
Contributor

Another example here would be metrics. At present I am doing some backflips to reformat the metrics into something that can be used downstream.

@furiel
Collaborator

furiel commented Mar 29, 2020

I have been thinking about this topic lately.

We will need two things:

  • What do we mean by iteration?
  • How do we instruct the iteration? What should it do with the list?

The answer to the first question is easy: map, filter, reduce.
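
As a rough illustration of what these three operations mean, here is a small Python model. It is not syslog-ng syntax; the sample list and the lambdas are made up for the example.

```python
from functools import reduce

# Hypothetical sample data, standing in for a list attached to a log message.
items = ["ABC", "CDE", "ABD"]

# map: apply a function to every element; the result has the same length.
lowered = list(map(lambda x: x.lower(), items))

# filter: keep only the elements that satisfy a predicate.
ab_only = list(filter(lambda x: x.startswith("AB"), items))

# reduce: fold the whole list into a single value, here a comma-joined string.
joined = reduce(lambda acc, x: acc + "," + x if acc else x, items, "")

print(lowered)
print(ab_only)
print(joined)
```

In each case the function passed in needs a name for "the current element" (`x` above), which is exactly the binding question raised next.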

The second part is tricky. The most intuitive way would be to pass a template function to the loop constructs. The difficulty, from an implementation point of view, is that these template functions need to be special: they need to create bindings for the original value. For example:

$(map $(echo "<something-that-refers-to-the-item>") $list)

or more ideally

$(map $(lambda (x) $(echo $x)) $list)

U̶n̶f̶o̶r̶t̶u̶n̶a̶t̶e̶l̶y̶,̶ ̶t̶e̶m̶p̶l̶a̶t̶e̶s̶ ̶i̶n̶ ̶s̶y̶s̶l̶o̶g̶-̶n̶g̶ ̶c̶a̶n̶n̶o̶t̶ ̶e̶s̶t̶a̶b̶l̶i̶s̶h̶ ̶b̶i̶n̶d̶i̶n̶g̶s̶ ̶c̶u̶r̶r̶e̶n̶t̶l̶y̶.̶
As Bazsi explained below, $_ can be used for this purpose.

I am experimenting with this topic in: #3205.
This would refer to the non-lambda version. The list functions could anaphorically bind the list elements, similarly to aiterate in the linked PR.

The interesting part of aiterate is that it can be used to loop through a list, though in an inefficient way: O(n^2), where n is the length of the list. You can check the example in the PR.
Still, if we have aiterate, we just need to add a $(take k <template-function>) that rebuilds the list from the iterator. Then we would have the "map" functionality.
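
To make the idea concrete, here is a Python model of an iterator plus a "take" that rebuilds a list from it. The names `iterate`, `take` and `list_nth` mirror the template functions discussed here, but their exact behaviour is an assumption of this sketch (in particular that the first evaluation returns the initial value, and that a positional lookup scans the list from the head, which is one plausible reading of the O(n^2) remark).

```python
from typing import Callable, List

def iterate(step: Callable, initial):
    """Stateful callable: the first call returns `initial`,
    every later call applies `step` to the previous value."""
    state = {"started": False, "value": initial}

    def next_value():
        if state["started"]:
            state["value"] = step(state["value"])
        else:
            state["started"] = True
        return state["value"]

    return next_value

def take(k: int, source: Callable) -> List:
    """Evaluate `source` k times and collect the results into a list."""
    return [source() for _ in range(k)]

scan = {"steps": 0}  # counts how many elements were visited in total

def list_nth(index: int, items: List[str]) -> str:
    """Positional lookup that walks from the head (assumed cost: O(index))."""
    for position, item in enumerate(items):
        scan["steps"] += 1
        if position == index:
            return item
    raise IndexError(index)

items = ["ABC", "CDE", "ABD"]  # hypothetical sample list
index = iterate(lambda i: (i + 1) % len(items), 0)

# "map" built from an index iterator, a lookup and take.
mapped = take(len(items), lambda: list_nth(index(), items).lower())

print(mapped)
print(scan["steps"])
```

Under these assumptions the lookups visit 1 + 2 + ... + n elements, which is where the quadratic cost would come from.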

@bazsi
Collaborator

bazsi commented Mar 29, 2020 via email

@furiel
Collaborator

furiel commented Apr 2, 2020

Indeed, that is exactly what I was looking for. I am rebuilding iterate around that.

@jszigetvari
Contributor Author

jszigetvari commented Apr 7, 2020

I don't know whether I understand it correctly, but from the usage examples over in #3205, it seems that this iteration can't be used in the following use case:
A message arrives, and through some parsing magic, a list is created, which belongs to that specific log message.
Through this iteration construct I would have liked, for instance, to print out only the list elements that match a certain pattern.

So for example, through the parsing, we would get a list like this:
\"ABC\",\"CDE\",\"ABD\"

And we would apply a filter pattern for the string "AB"

We would use a template like this:

template("$MSG <iteration construct where item matches "AB" prefix print item>")

And in the end we would get an output like this:
content-of-MSG ABC ABD

However the current solution seems to use a static list, where the iteration happens over subsequent incoming messages.
Sorry for not clearing this up earlier.

@bazsi
Collaborator

bazsi commented Apr 7, 2020 via email

@furiel
Collaborator

furiel commented Apr 8, 2020

@jszigetvari Yes, it can be used for filtering too. Almost. Creating a series of the matching prefixes is fairly straightforward. Building a list from them is difficult with iterate, because either

  • we need a new $(take) template function, which I mentioned above, or
  • we write the concatenation with iterate in reduce style. But then you would need a mechanism to consume an iterator n times: something like (iter-nth).

If you know the length of the list, then of course you can consume the iterator manually.

For fun and profit, I created an example config that shows the intention. It is very complex, but again, I do not intend iterate to be used in practice for this use case. It is a proof of concept. Once that is accepted, I can create the other, proper template functions based on it.

The method is simple: with iterate, I generate a series of indexes that will be used to walk through the parsed list using list-nth. With the combination of $(if) and $(substr), I can create a series that walks through the list elements, but for the non-matching elements, it outputs "" instead of the item. For building a list from the series, I can again use iterate with list-concat, which would return the iterator that outputs the filtered version of the 'k'-th slice of the original list. If I consume the iterator as many times as the list is long, the next evaluation returns the complete filtered list.

Some technical difficulties arose that I needed to deal with: I needed to create two copies of most of the functions to avoid multiple evaluation. In the if statement, there was a missing evaluation too: the true-branch is only evaluated if the pattern matches, but I always needed to step the iterator. So I manually consumed it in the else branch too. These kinds of complexities can be addressed with a new $(let) template function. But the plan is that I first create map and filter, and extend find with a --key-function option. Only after that can I start working on let and reduce.

@version: 3.26

template-function index-copy1 "$(iterate $(% $(+ 1 $_) $(list-count $parsed)) 0)";
template-function element-copy1 "$(list-nth $(index-copy1) $parsed)";
template-function prefix-copy1 "$(substr $(element-copy1) 0 2)";

template-function index-copy2 "$(iterate $(% $(+ 1 $_) $(list-count $parsed)) 0)";
template-function element-copy2 "$(list-nth $(index-copy2) $parsed)";
template-function prefix-copy2 "$(substr $(element-copy2) 0 2)";

template-function nothing "$(substr $(element-copy2) 0 0)";

template-function return-item-if-prefix-AB "$(if '\"$(prefix-copy1)\" eq \"AB\"' $(element-copy2) \"$(nothing)\")";

template-function partial-concatenates "$(iterate $(list-concat $_ $(return-item-if-prefix-AB)) '')";
template-function consume-without-echo "$(substr $(partial-concatenates) 0 0)";
template-function joined "$(consume-without-echo)$(consume-without-echo)$(consume-without-echo)$(partial-concatenates)";


log {
    source { example-msg-generator(num(1) freq(0.001)); };
    rewrite { set("$(list-concat 'ABC' 'CDE' 'ABD')" value(parsed)); };
    destination { file(/dev/stdout template("$(joined)\n")); };
};

results in

$ timeout 1 ./syslog-ng -Fe -f ../etc/iterate.conf 
[2020-04-08T07:16:40.707777] syslog-ng starting up; version='3.26.1.139.g80c63bc.dirty'
ABC,ABD
[2020-04-08T07:16:41.684647] syslog-ng shutting down; version='3.26.1.139.g80c63bc.dirty'
$ 
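
For readers who find the config hard to follow, the same evaluation order can be modelled in Python. This is only a sketch: the function names echo the template functions in the config above, the semantics of `iterate` (first call returns the initial value) are an assumption, and `list_concat` is assumed to drop empty elements, which is consistent with the output shown.

```python
parsed = ["ABC", "CDE", "ABD"]

def iterate(step, initial):
    """First call returns `initial`; later calls apply `step` to the previous value."""
    state = {"started": False, "value": initial}
    def next_value():
        if state["started"]:
            state["value"] = step(state["value"])
        else:
            state["started"] = True
        return state["value"]
    return next_value

# Two independent index iterators, so that reading the prefix
# does not also advance the iterator used to fetch the element.
index_copy1 = iterate(lambda i: (i + 1) % len(parsed), 0)
index_copy2 = iterate(lambda i: (i + 1) % len(parsed), 0)

def prefix_copy1():
    return parsed[index_copy1()][:2]

def element_copy2():
    return parsed[index_copy2()]

def nothing():
    # Steps index_copy2 but contributes an empty string.
    return element_copy2()[:0]

def return_item_if_prefix_ab():
    # Both branches step index_copy2, keeping the two iterators in sync.
    if prefix_copy1() == "AB":
        return element_copy2()
    return nothing()

def list_concat(*parts):
    # Assumption: empty elements are dropped when concatenating.
    return ",".join(p for p in parts if p)

partial_concatenates = iterate(
    lambda acc: list_concat(acc, return_item_if_prefix_ab()), "")

def joined():
    # Consume the iterator once per list element, then read the final value.
    for _ in range(len(parsed)):
        partial_concatenates()
    return partial_concatenates()

result = joined()
print(result)
```

The intermediate values of `partial_concatenates` are "", "ABC", "ABC" and finally "ABC,ABD", which matches the line printed by syslog-ng above.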

@jszigetvari
Contributor Author

After a discussion with @furiel, I now see that the planned end product of this effort will be useful in my use case.


8 participants