Short-circuit evaluation is a straight forward subject for lazy evaluation on Boolean expressions, I noted that two factors could be useful for ordering Boolean functions to gain run-time. But this is applicable in a particular hypothesis:
Knowing a percentage of positive (hence negative) result of a function. the following is a hypothesis followed be an example.
Knowing execution run-time of the function on data (knowing size of data can be a hint)
My case is easy to understand, here is a description:
Let's say F is a Boolean function that scores sentiment and returns -1 or 1. I've been doing some improvements to predict sentiment scores on big arrays of text, and I can calculate how is likely a function will return 1 (probability withing a tolerated error range) lets call it P(f) = Pf (probability of being positive)
We have also an estimation of execution time since data to be analysed is measurable. lets call it T(f) = Tf we define positiveness speed as follows:
PS (f) = Pf / Tf
Let Lazy be a function which takes a Boolean expression, and gives the best ordering of functions for run-time.
Lazy (and, f1, f2, …, fn) = and_Order (f1, f2, …, fn)
Where
and_Order is ascending of positiveness speed.
Lazy (or, f1, f2, …, fn) = or_Order(f1, f2, …, fn)
Where
or_Order is descending if positiveness speed.
Obviously this is a high level optimization and not a compiler task. Is this an established study? How is this practical in cases other than sentiment analysis? Is the PS(F) even correctly representing what it should be representing?
Edit in response to ppolymorphe answer:
Quote:
Quote:
First of all, Boolean values are 0 for False and 1 for True. In computers, the usage to use 0 for False and non 0 for True.
There is nothing that holds usage of Boolean type for representing two values types. Even if it doesn't hold truth meaning. In my case It does. Here is a scenario:
After an event called "Innovation week", We wants to see if visitors were satisfied at many levels: "organization", "subjects", "proposed solutions"…
Let’s say that "organization", "subjects", "proposed solutions" are three topics that the most represent people's reaction (sentiment) on "Innovation week event". "Organization" and "subjects" are on the same level and one is sufficient to satisfy the other (logical OR), while both grouped are in the same level at "proposed solutions" (logical AND).
Possible modelization:
Sent ("Innovation week") = (Sent("organization") || Sent("subjects")) && Sent("proposed solutions")
Quote:
Short-circuit evaluation is a straight forward subject for lazy evaluation on Boolean expressions,
Where Sent is polar sentiment function (Boolean)
Quote:
No, it is the efficient say to Boolean expression evaluation.
That's True. I didn't give attention to.
Quote:
your reordering must be faster, unfortunately, it is impossible.
Reordering at runtime imply evaluating all your slow functions before reordering, there is no gain there.
That's True.
But, I will be evaluating a sample (say 1/10 of the input big array), therfore I will accept error range. That's why
Quote:
Where and_Order is ascending on positiveness speed.
Where positiveness speed is a probability (based on sampling).
Thank you.
What I have tried:
This is an optimization I thought about, but I'm not sure about it's feasibility neither it's correctness