Thanks for your appreciation.
|
First, from a design standpoint, collections should never be null. Yes, I know that a reference can be null, but if it's a reference to a collection it never should be. There just isn't a valid reason for it.
More importantly, though, extension methods should always fail if the "this" parameter is null. Otherwise it breaks the symmetry with member functions. So if you insist on checking for null, this should be a static method not an extension method.
William E. Kempf
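To make the asymmetry described above concrete, here is a minimal sketch. The `Box` class and the `IsMissing` extension are hypothetical names invented for this illustration. An extension method compiles to a static call, so it can run with a null receiver, whereas an instance member called on the same null reference throws.

```csharp
using System;

// Hypothetical type used only for this illustration.
public class Box
{
    public bool IsEmptyMember() { return true; }
}

public static class BoxExtensions
{
    // Compiled as a plain static call: BoxExtensions.IsMissing(box).
    public static bool IsMissing(this Box self) { return self == null; }
}

public static class NullReceiverDemo
{
    public static bool ExtensionOnNull()
    {
        Box box = null;
        return box.IsMissing();      // no exception, the null check runs
    }

    public static bool MemberOnNullThrows()
    {
        Box box = null;
        try
        {
            box.IsEmptyMember();     // instance call on a null reference
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ExtensionOnNull());
        Console.WriteLine(MemberOnNullThrows());
    }
}
```

Whether the first behaviour is a convenience or a broken pattern is exactly the design question being argued in this thread; the sketch only shows that both behaviours exist.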
|
ONE design standpoint is that collections should never be null. Another one is that they CAN be. There are many discussions on this on the web. It all depends on the situation you are trying to represent and how you want the code to reflect that.
|
Agreed, but William is right that a null object shouldn't be allowed to call a method on itself. Yes, technically it works, but it breaks the pattern and IMO shouldn't even be allowed. That leaves the question of whether the collection is empty or not, and there's already the Any() LINQ extension for that. Therefore this whole article is pointless.
|
Nice trick and easy to implement. Thanks
|
Thanks, glad to know that you liked it.
|
Just a query re collection-vs-enumerable in extension methods: I know there's some issues with extension methods when working with polymorphic overrides. Have you actually debug stepped through your code to see if the "correct" variant is called on collections (i.e. a collection is also an enumerable)?
I'm definitely in agreement that you should not use "Count" on an enumerable to test if it's empty - that's just plain "silly". But how much faster is testing Count on a collection vs. the Any() function on an enumerable? Not to mention, it is conceivable that some person (idiot?) goes and implements an ICollection linked-list with its count property implemented through stepping through each node instead of the way the built-in LinkedList does it (i.e. store an int value keeping track of the count).
Perhaps you could just as well consolidate both those by simply omitting the collection version. Of course depending on if the performance hit on Any-vs-Count is not big enough to make much difference.
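As a rough illustration of why counting a plain enumerable is wasteful for an emptiness test, the sketch below uses a hypothetical lazy source (`Numbers`) that records how many elements it has produced. It is an assumption-laden toy, not a benchmark: it only shows how much of the sequence each call consumes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCountDemo
{
    // Number of elements the lazy sequence has produced so far.
    public static int Pulled;

    // Hypothetical lazy source: bumps the counter for each element it yields.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    public static int PulledByAny(int n)
    {
        Pulled = 0;
        bool hasItems = Numbers(n).Any();   // stops after the first element
        return Pulled;
    }

    public static int PulledByCount(int n)
    {
        Pulled = 0;
        int count = Numbers(n).Count();     // has to walk the whole sequence
        return Pulled;
    }

    public static void Main()
    {
        Console.WriteLine(PulledByAny(1000));    // 1
        Console.WriteLine(PulledByCount(1000));  // 1000
    }
}
```

For a source that implements ICollection<T>, such as List<T>, the LINQ Count() method reads the Count property instead of iterating, so the gap shown here applies to lazily produced sequences rather than to ordinary collections.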
|
First of all, thanks for your kind words. They open many doors for further research for me.
I believe many of them are in this sentence:
Hint: one will give me O(n) while the other gives O(1). Enjoy!
irneb wrote: Just a query re collection-vs-enumerable in extension methods: I know there's some issues with extension methods when working with polymorphic overrides. Have you actually debug stepped through your code to see if the "correct" variant is called on collections (i.e. a collection is also an enumerable)?
Yes, I debugged it. Unfortunately, for large collections I couldn't step through in depth using VS2013; I will try VS2015.
irneb wrote: I'm definitely in agreement that you should not use "Count" on an enumerable to test if it's empty - that's just plain "silly".
Absolutely correct!
irneb wrote: But how much faster is testing Count on a collection vs. the Any() function on an enumerable?
I need to perform some kind of load testing. I will try to add more content to this post later this weekend.
irneb wrote: Perhaps you could just as well consolidate both those by simply omitting the collection version. Of course depending on if the performance hit on Any-vs-Count is not big enough to make much difference.
I would prefer whichever one performs better.
Please share your thoughts. I will definitely try to do some kind of load testing for both scenarios.
@Mahsa-Hassankashi can also shed some light on this, as she has already done a few examples on collections.
modified 13-Aug-15 10:33am.
|
I've done a bit of testing, and yes, Count is slightly faster (it takes around 67% of the time taken by Any). A more beneficial alternative is the array's Length, at around 28% of the time taken by Any.
However, something very strange: I ran this benchmark on two systems and got wildly different results. E.g.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;

namespace EmptyOrNullTest
{
    class Program
    {
        const long REPEAT = 0x3FFFFFF;
        const int SIZE = 20;
        static void Print(StreamWriter f, string name, long repeat, TimeSpan time)
        {
            // NOTE: as posted, this line ignores the 'repeat' and 'time' arguments and
            // re-times YieldIntAny on every call, so each row reports YieldIntAny's time.
            var s = string.Format("{0}\t{1}\t{2}", name, REPEAT, Test(YieldIntAny, REPEAT));
            f.WriteLine(s);
            Console.WriteLine(s);
        }
        static void Main(string[] args)
        {
            using (StreamWriter f = new StreamWriter(@"Results.CSV", false, Encoding.UTF8)) {
                Print(f, "YieldIntAny", REPEAT, Test(YieldIntAny, REPEAT));
                Print(f, "EnumIntAny", REPEAT, Test(EnumIntAny, REPEAT));
                Print(f, "EnumIntNullAny", REPEAT, Test (EnumIntNullAny, REPEAT));
                Print(f, "ArrayIntAny", REPEAT, Test (ArrayIntAny, REPEAT));
                Print(f, "ArrayIntLength", REPEAT, Test (ArrayIntLength, REPEAT));
                Print(f, "ArrayIntNullAny", REPEAT, Test (ArrayIntNullAny, REPEAT));
                Print(f, "ArrayIntNullLength", REPEAT, Test (ArrayIntNullLength, REPEAT));
                Print(f, "ColIntAny", REPEAT, Test (ColIntAny, REPEAT));
                Print(f, "ColIntCount", REPEAT, Test (ColIntCount, REPEAT));
                Print(f, "ColIntNullCount", REPEAT, Test (ColIntNullCount, REPEAT));
                Print(f, "ColIntNullAny", REPEAT, Test (ColIntNullAny, REPEAT));
                Print(f, "EnumEx", REPEAT, Test (EnumEx, REPEAT));
                Print(f, "ArrayEx", REPEAT, Test (ArrayEx, REPEAT));
                Print(f, "ArEnEx", REPEAT, Test (ArEnEx, REPEAT));
                Print(f, "ColEx", REPEAT, Test (ColEx, REPEAT));
                Print(f, "ColEnEx", REPEAT, Test (ColEnEx, REPEAT));
            }
            Console.ReadLine ();
        }
        #region Helpers
        void TestArrayLength ()
        {
        }
        // Times 'repeat' calls of 'act' after two warm-up calls and a forced GC.
        static TimeSpan Test(Action act, long repeat = 1)
        {
            if (repeat < 1) {
                throw new ArgumentOutOfRangeException ("repeat", "must be a positive integer");
            }
            Stopwatch sw = new Stopwatch ();
            act ();
            act ();
            System.GC.Collect ();
            System.GC.WaitForPendingFinalizers ();
            sw.Restart ();
            for (long i = 0; i < repeat; i++) {
                act ();
            }
            sw.Stop ();
            return sw.Elapsed;
        }
        static Program()
        {
            ArrayInt = new int[SIZE];
            ColInt = new List<int>();
            for (int i = 0; i < SIZE; i++) {
                ArrayInt[i] = i;
                ColInt.Add(i);
            }
            EnumInt = ColInt;
        }
        static IEnumerable<int> YieldInt
        {
            get
            {
                for (int i = 0; i < SIZE; i++)
                {
                    yield return i;
                }
            }
        }
        static IEnumerable<int> EnumIntNull = null;
        static IEnumerable<int> EnumInt;
        static int[] ArrayInt;
        static int[] ArrayIntNull = null;
        static ICollection<int> ColInt;
        static ICollection<int> ColIntNull = null;
        static bool TestEnumAny<T> (IEnumerable<T> en)
        {
            return (en == null) || !en.Any ();
        }
        static bool TestArrayLength<T> (T[] ar)
        {
            return (ar == null) || (ar.Length < 1);
        }
        static bool TestCollection<T> (ICollection<T> col)
        {
            return (col == null) || (col.Count < 1);
        }
        #endregion
        static void YieldIntAny ()
        {
            bool b = TestEnumAny (YieldInt);
        }
        static void EnumIntAny ()
        {
            bool b = TestEnumAny (EnumInt);
        }
        static void EnumIntNullAny ()
        {
            bool b = TestEnumAny (EnumIntNull);
        }
        static void ArrayIntAny ()
        {
            bool b = TestEnumAny (ArrayInt);
        }
        static void ArrayIntNullAny ()
        {
            bool b = TestEnumAny (ArrayIntNull);
        }
        static void ArrayIntLength ()
        {
            bool b = TestArrayLength (ArrayInt);
        }
        static void ArrayIntNullLength ()
        {
            bool b = TestArrayLength (ArrayIntNull);
        }
        // NOTE: as posted, ColIntAny and ColIntNullAny call TestCollection (Count),
        // not TestEnumAny, so they measure the same code path as the Count variants.
        static void ColIntAny ()
        {
            bool b = TestCollection (ColInt);
        }
        static void ColIntNullAny ()
        {
            bool b = TestCollection (ColIntNull);
        }
        static void ColIntCount ()
        {
            bool b = TestCollection (ColInt);
        }
        static void ColIntNullCount ()
        {
            bool b = TestCollection (ColIntNull);
        }
        static void EnumEx ()
        {
            bool b = EnumInt.IsNullOrEmpty ();
        }
        static void ArrayEx ()
        {
            bool b = ArrayInt.IsNullOrEmpty ();
        }
        static void ArEnEx ()
        {
            bool b = ((IEnumerable<int>)ArrayInt).IsNullOrEmpty ();
        }
        static void ColEx ()
        {
            bool b = ColInt.IsNullOrEmpty ();
        }
        static void ColEnEx ()
        {
            bool b = ((IEnumerable<int>)ColInt).IsNullOrEmpty ();
        }
    }
}
And the extension methods I used above:
public static class XEmptyOrNullEnumerables
{
    public static bool IsNullOrEmpty<T>(this IEnumerable<T> self)
    {
        return (self == null) || !self.Any();
    }
    public static bool IsNullOrEmpty<T>(this T[] self)
    {
        return (self == null) || (self.Length < 1);
    }
    public static bool IsNullOrEmpty<T>(this ICollection<T> self)
    {
        return (self == null) || (self.Count < 1);
    }
}
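On the earlier question of which variant gets called: extension methods are bound at compile time from the static type of the expression, not from the runtime type of the object. The sketch below uses hypothetical probe overloads (`Which`) that mirror the three parameter types above and simply report which one the compiler picked.

```csharp
using System;
using System.Collections.Generic;

public static class OverloadProbe
{
    // Hypothetical probe overloads, one per parameter type used above.
    public static string Which<T>(this IEnumerable<T> self) { return "IEnumerable"; }
    public static string Which<T>(this T[] self) { return "Array"; }
    public static string Which<T>(this ICollection<T> self) { return "ICollection"; }
}

public static class OverloadDemo
{
    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        ICollection<int> asCollection = list;
        IEnumerable<int> asEnumerable = list;   // same object, wider static type
        int[] array = { 1, 2, 3 };

        Console.WriteLine(list.Which());                       // ICollection
        Console.WriteLine(asCollection.Which());               // ICollection
        Console.WriteLine(asEnumerable.Which());               // IEnumerable
        Console.WriteLine(array.Which());                      // Array
        Console.WriteLine(((IEnumerable<int>)array).Which());  // IEnumerable
    }
}
```

So a collection that is only known to the compiler as an IEnumerable<T> takes the Any() path, which is what the casts in the ArEnEx and ColEnEx tests force deliberately.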
All of this was compiled as a Release build targeting AnyCPU and then run from a console to avoid any IDE interference.
System 1:
MonoDev 5.5 on Kubuntu 14.04 (64bit), i7-2600 with 16GB RAM.
FunctionName Iterations Seconds
YieldIntAny 67108863 2.33333333333333
EnumIntAny 67108863 0.77893518518519
EnumIntNullAny 67108863 0.30902777777778
ArrayIntAny 67108863 0.84143518518519
ArrayIntLength 67108863 0.24537037037037
ArrayIntNullAny 67108863 0.32754629629630
ArrayIntNullLength 67108863 0.24537037037037
ColIntAny 67108863 0.51273148148148
ColIntCount 67108863 0.51620370370370
ColIntNullCount 67108863 0.31597222222222
ColIntNullAny 67108863 0.31597222222222
EnumEx 67108863 0.78703703703704
ArrayEx 67108863 0.25347222222222
ArEnEx 67108863 0.87731481481482
ColEx 67108863 0.55092592592593
ColEnEx 67108863 0.82175925925926
...
BUT
...
System 2:
VisualStudio 2013 Community Edition, Win8.1 Pro 64 bit, i7-5930K, 64GB RAM. DotNet frameworks 2.0, 3.0, 3.5, 4.0 & 4.5.1 installed.
FunctionName Iterations Seconds
YieldIntAny 67108863 1.62037037037037
EnumIntAny 67108863 1.94097222222222
EnumIntNullAny 67108863 1.94097222222222
ArrayIntAny 67108863 1.94212962962963
ArrayIntLength 67108863 1.94328703703704
ArrayIntNullAny 67108863 1.94212962962963
ArrayIntNullLength 67108863 1.94328703703704
ColIntAny 67108863 1.96412037037037
ColIntCount 67108863 1.94791666666667
ColIntNullCount 67108863 1.94791666666667
ColIntNullAny 67108863 1.93981481481482
EnumEx 67108863 1.94212962962963
ArrayEx 67108863 1.94212962962963
ArEnEx 67108863 1.94097222222222
ColEx 67108863 1.94444444444444
ColEnEx 67108863 1.94097222222222
It's as if something else is taking up so much time in the VS compiled variant that the test is meaningless.
|
Awesome! I really appreciate your work and help. I would like to take this example and work with more complex scenarios. You're correct that Length on an array performs better, but that is beyond my scope: this post relates to one of my real-time examples, where we're not using arrays.
I'm glad to get this input from you. Thanks, and I will definitely update the code and contents with more complex examples.
If you want to add these contents to this post, I will be happy to work as a co-author with you, if CodeProject allows.
@Deeksha-Shenoy - can we add a co-author to this post?
|
Seems I've made a "stupid" error: while consolidating the print into one function that writes to both the console and the CSV, I copy-pasted code and forgot to change the arguments.
Anyhow, did a revamp on the benchmarking test - so it's more of a general purpose test:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.IO;
using System.Reflection;

namespace Benchmarking
{
    public class BenchmarkTest
    {
        #region Type Declarations
        public static CultureInfo DefaultCulture;
        static BenchmarkTest ()
        {
            DefaultCulture = new CultureInfo ("en-US");
        }
        public static readonly BenchmarkTest Default = new BenchmarkTest();
        public struct Result : IComparable<Result>
        {
            // Fields are assigned in the constructor; struct field initializers
            // are not allowed before C# 10.
            public string Name;
            public long Iterations;
            public double Seconds;
            public Result(string name, long iterations, double seconds)
            {
                this.Name = name;
                this.Iterations = iterations;
                this.Seconds = seconds;
                this.singleSeconds = -1d;
            }
            public Result(string name, long iterations, TimeSpan time)
                :this(name, iterations, time.TotalSeconds) {}
            private double singleSeconds;
            // Average time per iteration, computed on first use.
            public double SingleIterationSeconds {
                get {
                    if (singleSeconds < 0d) {
                        singleSeconds = Seconds / Iterations;
                    }
                    return singleSeconds;
                }
            }
            #region IComparable implementation
            public int CompareTo (Result other)
            {
                return this.SingleIterationSeconds.CompareTo (other.SingleIterationSeconds);
            }
            #endregion
        }
        public enum PrintType { ProgressOnly, RunningResults, None };
        #endregion
        #region Constructors
        public BenchmarkTest(CultureInfo culture, bool writeToConsole, bool writeToDebug, StreamWriter logStream)
        {
            this.Culture = culture;
            this.WriteToConsole = writeToConsole;
            this.WriteToDebug = writeToDebug;
            this.stream = logStream;
            this.results = new List<Result> ();
        }
        public BenchmarkTest(bool writeToConsole, bool writeToDebug, StreamWriter logStream)
            : this(DefaultCulture, writeToConsole, writeToDebug, logStream)
        {}
        public BenchmarkTest(bool writeToConsole, StreamWriter logStream)
            : this(writeToConsole, false, logStream)
        {}
        public BenchmarkTest(StreamWriter logStream) : this(false, logStream) {}
        public BenchmarkTest(bool writeToConsole, bool writeToDebug)
            : this(writeToConsole, writeToDebug, null)
        {}
        public BenchmarkTest() : this(true, false) {}
        #endregion
        #region Settings
        public CultureInfo Culture;
        public bool WriteToConsole;
        public bool WriteToDebug;
        StreamWriter stream;
        public Stream LogStream {
            set {
                stream = (value == null) ? null : new StreamWriter (value);
            }
        }
        public bool WriteToStream { get { return (stream != null); } }
        public PrintType PrintRunningResults = PrintType.ProgressOnly;
        #endregion
        #region Output
        bool started = false;
        // Prints progress (or a running result line) for a single finished test.
        void Print (string name, long iterations, TimeSpan time)
        {
            if (!started) {
                PrintHeader ();
            }
            switch (PrintRunningResults) {
            case PrintType.ProgressOnly:
                if (WriteToConsole) {
                    Console.WriteLine (name);
                }
                if (WriteToDebug) {
                    Debug.WriteLine (name);
                }
                break;
            case PrintType.RunningResults:
                var s = string.Format (Culture, "{0,-30}{1,15}{2,20:F9}", name, iterations, time.TotalSeconds);
                if (WriteToConsole) {
                    Console.WriteLine (s);
                }
                if (WriteToDebug) {
                    Debug.WriteLine (s);
                }
                if (WriteToStream) {
                    s = string.Format (Culture, "\"{0}\",{1},{2:R}", name.Replace (',', '.').Replace('\"', '\''),
                        iterations, time.TotalSeconds);
                    stream.WriteLine (s);
                }
                break;
            }
        }
        void PrintHeader ()
        {
            if (PrintRunningResults == PrintType.RunningResults) {
                var s = string.Format (Culture, "{0,-30}{1,15}{2,20}", "Name", "Iterations", "Seconds");
                var bar = new String ('=', 30 + 15 + 20);
                if (WriteToConsole) {
                    Console.WriteLine (s);
                    Console.WriteLine (bar);
                }
                if (WriteToDebug) {
                    Debug.WriteLine (s);
                    Debug.WriteLine (bar);
                }
                if (WriteToStream) {
                    s = string.Format (Culture, "\"{0}\",\"{1}\",\"{2}\"", "Name", "Iterations", "Seconds");
                    stream.WriteLine (s);
                }
            }
            started = true;
        }
        // Prints the summary table, sorted by time per iteration, with each
        // result's time relative to the fastest one.
        public void Print()
        {
            if (results.Count < 1) {
                return;
            }
            var s = string.Format (Culture, "{0,-30}{1,15}{2,20}{3,20}", "Name", "Iterations", "Seconds", "Relative");
            var bar = new String ('=', 30 + 15 + 20 + 20);
            if (WriteToConsole) {
                Console.WriteLine (s);
                Console.WriteLine (bar);
            }
            if (WriteToDebug) {
                Debug.WriteLine (s);
                Debug.WriteLine (bar);
            }
            if (WriteToStream) {
                s = string.Format (Culture, "\"{0}\",\"{1}\",\"{2}\",\"{3}\"", "Name", "Iterations", "Seconds", "Relative");
                stream.WriteLine (s);
            }
            Result fastest = Results[0];
            foreach (var result in results) {
                s = string.Format (Culture, "{0,-30}{1,15}{2,20:F9}{3,20:F9}", result.Name, result.Iterations,
                    result.Seconds, result.SingleIterationSeconds / fastest.SingleIterationSeconds);
                if (WriteToConsole) {
                    Console.WriteLine (s);
                }
                if (WriteToDebug) {
                    Debug.WriteLine (s);
                }
                if (WriteToStream) {
                    s = string.Format (Culture, "\"{0}\",{1},{2:R},{3:R}", result.Name, result.Iterations,
                        result.Seconds, result.SingleIterationSeconds / fastest.SingleIterationSeconds);
                    stream.WriteLine (s);
                }
            }
        }
        #endregion
        #region Testing
        Stopwatch sw = new Stopwatch ();
        protected List<Result> results;
        // Times a fixed number of iterations after warm-up calls and a forced GC.
        public TimeSpan Run (string name, Action act, long iterations = 1)
        {
            if (iterations < 1) {
                throw new ArgumentOutOfRangeException ("iterations", "must be a positive integer");
            }
            act ();
            act ();
            GC.Collect ();
            GC.WaitForPendingFinalizers ();
            act ();
            sw.Restart ();
            for (long i = 0; i < iterations; i++) {
                act ();
            }
            sw.Stop ();
            Print (name, iterations, sw.Elapsed);
            var result = new Result (name, iterations, sw.Elapsed);
            sorted = false;
            results.Add (result);
            return sw.Elapsed;
        }
        public TimeSpan Run (Action act, long iterations = 1)
        {
            return Run (act.GetMethodInfo ().Name, act, iterations);
        }
        // Doubles the iteration count until a single timed run lasts at least 'seconds'.
        public TimeSpan RunAuto (string name, Action act, double seconds = 1.0d)
        {
            if (seconds < 0.0) {
                throw new ArgumentOutOfRangeException ("seconds", "cannot be negative");
            }
            act ();
            act ();
            GC.Collect ();
            GC.WaitForPendingFinalizers ();
            act ();
            long iterations = 0;
            sw.Stop ();
            sw.Reset ();
            while (sw.Elapsed.TotalSeconds < seconds) {
                iterations = (iterations < 1) ? 1 : iterations * 2;
                sw.Restart ();
                for (long i = 0; i < iterations; i++) {
                    act ();
                }
                sw.Stop ();
            }
            Print (name, iterations, sw.Elapsed);
            var result = new Result (name, iterations, sw.Elapsed);
            sorted = false;
            results.Add (result);
            return sw.Elapsed;
        }
        public TimeSpan RunAuto (Action act, double seconds = 1.0d)
        {
            return RunAuto (act.GetMethodInfo ().Name, act, seconds);
        }
        #endregion
        #region Results
        bool sorted = false;
        public IList<Result> Results {
            get {
                if (!sorted) {
                    results.Sort ();
                    sorted = true;
                }
                return results;
            }
        }
        public void Clear()
        {
            results.Clear ();
            sorted = false;
            started = false;
        }
        #endregion
    }
}
Then the testing function using the same class as my previous post:
static void Main(string[] args)
{
    using (StreamWriter f = new StreamWriter(@"Results.CSV", false, Encoding.UTF8)) {
        var b = new Benchmarking.BenchmarkTest (true, f);
        foreach (var act in new Action[] {YieldIntAny, EnumIntAny, EnumIntNullAny, ArrayIntAny,
            ArrayIntLength, ArrayIntNullAny, ArrayIntNullLength, ColIntAny, ColIntCount,
            ColIntNullAny, ColIntNullCount, ColEx, ColEnEx, ArEnEx, ArrayEx, EnumEx}) {
            b.RunAuto (act);
        }
        b.Print ();
    }
    Console.ReadLine ();
}
A bit more useful results:
Name                               Iterations             Seconds            Relative
=====================================================================================
ArrayEx                             536870912         1.593437600         1.000000000
ArrayIntNullLength                  536870912         1.601223200         1.004886040
ArrayIntLength                      536870912         1.607353500         1.008733257
ArrayIntNullAny                     268435456         1.087160600         1.364547441
ColIntNullCount                     268435456         1.150170300         1.443633940
ColIntNullAny                       268435456         1.165076300         1.462343176
EnumIntNullAny                      268435456         1.169268900         1.467605509
ColIntCount                         134217728         1.046758100         2.627672649
ColEx                               134217728         1.046950500         2.628155630
ColIntAny                           134217728         1.054542500         2.647213797
ArEnEx                              134217728         1.449512200         3.638704647
ArrayIntAny                         134217728         1.459083800         3.662732196
EnumEx                              134217728         1.484445900         3.726398574
EnumIntAny                          134217728         1.518965900         3.813053991
ColEnEx                             134217728         1.529009200         3.838265647
YieldIntAny                          33554432         1.022207800        10.264176520
Seems forcing the Any test on collections is pretty similar to using its Count property.
|
It looks great.
Do you want me to add this example to the post? If yes, I will add it as is, without modifications, under your name.
|
I don't mind at all. You can do so if you wish, AFAI'm concerned posting it here already meant I gave up any "ownership" of it.
|
irneb wrote: I don't mind at all. You can do so if you wish, AFAI'm concerned posting it here already meant I gave up any "ownership" of it.
Awesome! I will add these tests to the post under your name.
|
Could someone please test it on their systems? Is there really such a discrepancy between Mono and DotNet? If so it actually seems as if Mono performs better on average - perhaps I'm just doing something wrong here.
I've just redone this test on my Windows machine (W8.1 with VS 2013 CE):
Name                               Iterations             Seconds            Relative
=====================================================================================
ColIntNullAny                       536870912         1.413856600         1.000000000
ArrayIntNullLength                  536870912         1.413859000         1.000001697
EnumIntNullAny                      536870912         1.414000600         1.000101849
ArrayIntNullAny                     536870912         1.414018700         1.000114651
ColIntNullCount                     536870912         1.418061500         1.002974064
ArrayIntLength                      536870912         1.555343400         1.100071535
ArrayEx                             536870912         1.555374400         1.100093461
ColEx                               268435456         1.282051600         1.813552520
ColIntCount                         268435456         1.285510400         1.818445237
ColIntAny                           268435456         1.289521800         1.824119646
YieldIntAny                          67108864         1.397846600         7.909410898
ArrayIntAny                          67108864         1.550962800         8.775785607
EnumEx                               33554432         1.058709000        11.980949129
ColEnEx                              33554432         1.059109800        11.985484808
EnumIntAny                           33554432         1.060733900        12.003864041
ArEnEx                               33554432         1.768205000        20.010006673
|
I heard the same thing from other devs. It looks like Windows resources were impacting the results. Mono is faster.
I'll give it a try this weekend and put your code in the post with a complete test-patch and other material.
|
I tried several times but ran into some code formatting issues.
Can you please give me your email ID and CP profile ID? I will add you as a co-author so that you can also edit this article and add your code and content.