ML.NET Model Builder November Updates

ML.NET is an open-source, cross-platform machine learning framework for .NET developers. It lets you integrate machine learning into your .NET apps without leaving the .NET ecosystem and without needing a background in ML or data science. ML.NET provides tooling (the Model Builder UI in Visual Studio and the cross-platform ML.NET CLI) that automatically trains custom machine learning models for you based on your scenario and data.

This release of ML.NET Model Builder brings numerous bug fixes and enhancements as well as new features, including advanced data loading options and streaming training data from SQL.

In this post, we’ll cover the following items:

  1. Advanced data loading options
  2. Streaming from SQL Server with Database Loader
  3. Feedback
  4. Get started and resources

Advanced data loading options

Previously, Model Builder did not offer any data loading options. It relied on AutoML to detect each column's purpose, whether the file has a header row, the column separator, and the decimal separator style.

Let’s take a look at the new advanced data loading options in Model Builder using the taxi fare dataset. This is a regression problem where you predict the taxi fare amount based on several factors like distance traveled, payment type, and number of passengers.

In Model Builder, after selecting the Value prediction scenario and the local training environment, you’ll end up on the Data step. Choose File as the Data source type, browse for the taxi fare dataset, and once the dataset is selected, change the Column to predict (Label) to fare_amount.

Data step in Model Builder

Select Advanced data options to open the advanced data loading options dialog.

Advanced data options column settings

In this dialog, there are two sections: Column settings and Data formatting.

Column settings

In the Column settings section, you can change the column purpose of each Feature column (columns which are used to predict the Label) to Categorical, Text, Numerical, or Ignore:

  • Categorical columns contain data that is in a discrete number of labeled groups. For instance, Payment Type, which can be CSH (cash) or CRD (card) would be Categorical.
  • Text columns contain free-form text strings. For example, if you had a model that predicted whether reviews left by taxi passengers about their ride were positive or negative, the column containing the free-form comments would have a column purpose of Text.
  • Numerical columns contain numbers only (floating point or integers). In the taxi fare example, trip distance and trip time are both Numerical columns.
  • You can Ignore columns that you don’t want to use for training.

Normally, Model Builder does a suitable job of determining the column purpose, but there are cases where it might infer incorrectly or might choose a column purpose that gives slightly worse model performance. For instance, in the taxi fare example, Model Builder chooses Categorical for the passenger_count column, but this could also be a Numerical column.
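To see why automatic inference can land on Categorical for a column like passenger_count, here is a deliberately naive sketch in Python. It is not Model Builder's or AutoML's actual logic: the function name `guess_column_purpose`, the `max_categories` threshold, and the sample values are all hypothetical and exist only for illustration.

```python
def guess_column_purpose(values, max_categories=3):
    """Guess a column purpose from sample values.

    Hypothetical heuristic for illustration only:
    - few distinct values        -> "Categorical"
    - many values, all numeric   -> "Numerical"
    - anything else              -> "Text"
    """
    distinct = set(values)
    if len(distinct) <= max_categories:
        return "Categorical"
    try:
        for value in values:
            float(value)  # raises ValueError for non-numeric strings
        return "Numerical"
    except ValueError:
        return "Text"


# Made-up sample values, loosely modeled on the taxi fare columns.
passenger_count = ["1", "2", "1", "1", "3", "2"]
trip_distance = ["1.2", "3.4", "0.8", "5.1", "2.2", "7.9"]
payment_type = ["CSH", "CRD", "CSH", "CSH", "CRD", "CRD"]
comments = ["great ride", "too slow", "friendly driver",
            "car was dirty", "fine", "would ride again"]

purposes = {
    "passenger_count": guess_column_purpose(passenger_count),
    "trip_distance": guess_column_purpose(trip_distance),
    "payment_type": guess_column_purpose(payment_type),
    "comments": guess_column_purpose(comments),
}
```

Under this toy rule, passenger_count has only a handful of distinct values, so it is treated like a set of groups even though its values are numbers. That is the kind of ambiguity the Column settings section lets you resolve by hand.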

You can try training with the default settings chosen by Model Builder and then try changing the Column purpose of passenger_count to Numerical to see how it affects the model’s performance.
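The difference between the two column purposes comes down to how the values reach the trainer. The Python sketch below shows one common way each purpose can be encoded (one indicator per distinct value for Categorical, a single number for Numerical). It is an assumption-laden illustration, not a description of how ML.NET featurizes columns internally, and the helper names and sample values are made up.

```python
def one_hot(values):
    """Encode values as categories: one 0/1 indicator per distinct value."""
    categories = sorted(set(values))
    encoded = [[1.0 if value == category else 0.0 for category in categories]
               for value in values]
    return categories, encoded


def as_numeric(values):
    """Encode values as a single floating-point feature."""
    return [[float(value)] for value in values]


# Made-up passenger_count values as they might appear in the raw file.
passenger_count = ["1", "2", "1", "4"]

categories, categorical_features = one_hot(passenger_count)
numerical_features = as_numeric(passenger_count)
```

With the Categorical encoding, the model sees "1", "2", and "4" as unrelated groups. With the Numerical encoding, it can learn that 4 passengers is more than 2. Which one works better depends on the data, which is why it is worth training both ways and comparing.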

Advanced data options changing column purpose

Data formatting

In the Data formatting section, you can override the following data loading options chosen by Model Builder:

  • Whether the dataset has column headers or not
  • The column separator (comma, semicolon, or tab)
  • The decimal separator (decimal dot or comma)

Advanced data options data formatting

As soon as you save the Data formatting options, you can see how they affect the dataset in the Data Preview.
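To make the three formatting options concrete, the following Python sketch parses a small semicolon-separated sample that uses a decimal comma. The `load_rows` function, its parameter names, and the sample rows are hypothetical. They mirror the options in the dialog but are not Model Builder's loader.

```python
import csv
import io


def load_rows(text, has_header=True, separator=",", decimal_separator="."):
    """Parse delimited text into a list of dicts, converting numbers to float."""
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    if has_header:
        header, rows = rows[0], rows[1:]
    else:
        # No header row: fall back to generated column names.
        header = [f"col{i}" for i in range(len(rows[0]))]

    parsed = []
    for row in rows:
        record = {}
        for name, cell in zip(header, row):
            # Normalize the decimal separator before trying to parse a number.
            candidate = cell.replace(decimal_separator, ".")
            try:
                record[name] = float(candidate)
            except ValueError:
                record[name] = cell  # keep non-numeric cells as strings
        parsed.append(record)
    return parsed


# Made-up sample: semicolon column separator, comma decimal separator.
sample = ("payment_type;trip_distance;fare_amount\n"
          "CSH;1,5;7,25\n"
          "CRD;3,0;12,5\n")

correct = load_rows(sample, separator=";", decimal_separator=",")
wrong = load_rows(sample)  # default comma/dot settings misread the file
```

With the right settings, each row has three columns and the numbers parse as numbers. With the wrong ones, the whole header collapses into a single column, which is the kind of problem the Data Preview makes visible right away.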

Streaming from SQL Server with Database Loader

Model Builder now takes advantage of the Database Loader!

Previously, if your training data was stored in SQL Server, Model Builder would download the data locally and then train. Now, Model Builder streams training data directly from SQL Server without loading the whole dataset into memory, so it can handle datasets up to terabytes in size.
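The general idea behind streaming is to pull rows from the database in small batches and process them as they arrive, rather than materializing the full table first. The Python sketch below illustrates that pattern, with an in-memory SQLite database standing in for SQL Server. It is not the Database Loader's implementation. The `stream_rows` helper, the `trips` table, and its rows are assumptions made for the example.

```python
import sqlite3


def stream_rows(connection, query, batch_size=2):
    """Yield rows one at a time while fetching only batch_size rows per round trip."""
    cursor = connection.execute(query)
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield from batch


# In-memory SQLite database used as a stand-in for SQL Server.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE trips (trip_distance REAL, fare_amount REAL)")
connection.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [(1.0, 5.0), (2.0, 9.0), (3.0, 13.0), (4.0, 17.0)],
)

# Compute a running statistic without ever holding the full table in memory.
count = 0
total_fare = 0.0
query = "SELECT trip_distance, fare_amount FROM trips"
for trip_distance, fare_amount in stream_rows(connection, query):
    count += 1
    total_fare += fare_amount

mean_fare = total_fare / count
```

Because only one batch is held at a time, memory use stays roughly constant no matter how many rows the table contains.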

Feedback

We would love to hear your feedback!

If you run into any issues, please let us know by creating an issue in our GitHub repos or by using the new Feedback button in Model Builder.

Get started and resources

Get started with ML.NET in this tutorial.

Learn more about ML.NET and Model Builder in Microsoft Docs.

Tune in to the Machine Learning .NET Community Standup every other Wednesday at 10am Pacific Time.

The post ML.NET Model Builder November Updates appeared first on .NET Blog.



source https://devblogs.microsoft.com/dotnet/ml-net-model-builder-november-updates/
