Skip to content

add iterator based read_files_iter -> Iterator<Result<Directive>> method#115

Closed
jakobhellermann wants to merge 1 commit intojcornaz:mainfrom
jakobhellermann:iter-parser
Closed

add iterator based read_files_iter -> Iterator<Result<Directive>> method#115
jakobhellermann wants to merge 1 commit intojcornaz:mainfrom
jakobhellermann:iter-parser

Conversation

@jakobhellermann
Copy link
Copy Markdown
Contributor

It's often more convenient to work with an iterator, rather than a calllback function.
This PR adds beancount_parser::read_files_iter, which works like read_files except that it returns an iterator over all Entrys, recursively.

use beancount_parser::{read_files_iter, Entry};
use std::path::PathBuf;

let iter = read_files_iter::<f64>(std::iter::once(PathBuf::from("journal.beancount")));

for entry in iter {
    let entry = entry?;
    match entry {
        Entry::Directive(directive) => {
            println!("Found directive: {:?}", directive.content);
        }
        Entry::Option(option) => {
            println!("Found option: {} = {}", option.name, option.value);
        }
        Entry::Include(_) => unreachable!(),
        _ => {} // Entry is #[non_exhaustive]
    }
}

Do you have a preference for whether Import entries should be emitted as well?
I think I remember having a use case but I can't find it right now.

I lean towards yes, because that's less magic, preserves more information, and if you're only interested in Directives you're gonna have to pattern match either way.

@jcornaz
Copy link
Copy Markdown
Owner

jcornaz commented Feb 3, 2026

It's often more convenient to work with an iterator, rather than a calllback function.

often, yes. But not necessarily always. In this case, I am a bit skeptical that it provides enough benefits to justify the implementation complexity.

Could you show me an example of code for which using the iterator is tangibly better than using the callback version?

@jakobhellermann
Copy link
Copy Markdown
Contributor Author

Hm, to be honest for the actual code I ended up with it doesn't make much of a difference:

let mut directives = Vec::new();
for entry in beancount_parser::read_files_iter::<Decimal>(files) {
    if let Entry::Directive(directive) = entry? {
        directives.push(directive);
    }
}

vs

let mut directives = Vec::new();
beancount_parser::read_files::<Decimal>(files, |directive| {
    if let Entry::Directive(directive) = entry? {
        directives.push(directive);
    }
});

It would be more of an advantage if I wanted to collect all directives

let directives = beancount_parser::read_files_iter(files).collect::<Result<Vec<_>,_>>()?;

or ignore all errors

let directives = beancount_parser::read_files_iter(files).filter_map(Result::ok);

Actually, I just remembered a major advantage of the custom iterator: the ability to find out which files were loaded through the loaded method.
In my use case I not only parse the beancount files, but set up a file watch to reload them when any of them is changed:

let files = path.iter().map(Clone::clone);
let mut iter = beancount_parser::read_files_iter::<Decimal>(files);
for entry in iter.by_ref() {
    if let Entry::Directive(directive) = entry? {
        directives.push(directive);
    }
}

let loaded_files = iter.loaded();

let _watcher = notify::watch_files(loaded_files);

I don't think that would be possible with read_files. Even if the output included Import directives, it wouldn't be possible to associate them to which file they came from, and they would be relative to that. There could be a variant that returns that information, but at that point the implementation complexity would be no lower than this PR.

Also, one option I'd like to mention is that you could implement read_files on top of read_files_iter, that way you would only have one implementation of the core logic.

@jcornaz
Copy link
Copy Markdown
Owner

jcornaz commented Feb 5, 2026

It would be more of an advantage if I wanted to collect all directives

let directives = beancount_parser::read_files_iter(files).collect::<Result<Vec<_>,_>>()?;

That could be achieved in an even simpler way with the new read_files_to_vec (I just added in version 2.4.0):

let directives = beancount_parser::read_files_to_vec::<f64>(files)?;

or ignore all errors

let directives = beancount_parser::read_files_iter(files).filter_map(Result::ok);

Good point. The current API doesn't allow to ignore errors. I'll think about that.

Actually, I just remembered a major advantage of the custom iterator: the ability to find out which files were loaded through the loaded method.

Ah... ok. I get that. I have to think about what would be the best solution for this use-case specifically.

Also, one option I'd like to mention is that you could implement read_files on top of read_files_iter, that way you would only have one implementation of the core logic.

Yes, of course, and I would want to do it that way. But the read_files_iter would still have to be more complicated than the current implementation of read_files.

Thanks for your answers, they are good points. I'll think about it.

@jcornaz
Copy link
Copy Markdown
Owner

jcornaz commented Feb 14, 2026

In version 2.5.0, I made the read_files* functions to emit the (canonicalized) include directives instead of discarding them. And I also added a facility to read directly files into a BeancountFile.

In other words, it is now possible to get the list of path that was loaded. Either by handling the include directives, or by loading them into a BeancountFile and access the BeancountFile::includes field.

Of course, it is still not possible to filter or handle errors individually or partially load files that do not contain any error. Now, I am wondering, is this a use-case actually have?

@jakobhellermann
Copy link
Copy Markdown
Contributor Author

No I don't, I think this should be all that I need. Thanks!

@jcornaz
Copy link
Copy Markdown
Owner

jcornaz commented Feb 16, 2026

No I don't, I think this should be all that I need. Thanks!

Thanks to you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants