Changing Data Source Location In Power Query

Loading a Single Excel Workbook

If you are loading data from a file with a query like this

and then move the file to another folder or change the filename, you just need to change the path/filename in the Source step.
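For reference, such a query might look something like this minimal M sketch (the file path and sheet name here are hypothetical, not from the original article):

let
    Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"), null, true),
    Data_Sheet = Source{[Item="Data",Kind="Sheet"]}[Data]
in
    Data_Sheet

The piece to edit is the path inside File.Contents in the Source step.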

In Excel

In Power BI Desktop

Loading Files from a Folder

If you are loading multiple files from a folder on your PC the process is very similar.

You’ll have a query like this
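For example, something along these lines (a minimal sketch; the folder path is made up):

let
    Source = Folder.Files("C:\Data\Sales Reports")
in
    Source

When the folder moves, the path passed to Folder.Files in the Source step is what needs updating.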

In both of these cases you can also modify the query directly in the Advanced Editor by changing the file and/or folder path.

Moving Files Into the Cloud

Another common scenario is where you’ve created some queries with files on your PC and you then move the files either to OneDrive for Business or Sharepoint Online.

The changes to be made here are still fairly straightforward, but because the files are being moved to the cloud, we also have to change the connector being used in the queries.

Moving a File to OneDrive for Business

We’ve already seen the query that loads this Excel file from my PC.

Looking at the query in the editor you can see that after the Source step, I end up with this 5 column table.

The transformations that begin with the Expanded Data step will be the same whether I’m loading the file from my PC or OneDrive. What I need to do is change the source loading step so that after loading the file from OneDrive, I end up with this same 5 column table. The transformations can then do their work as they have the same data (the 5 column table) to work on.

I’ve moved the file to OneDrive into a folder called Sales

I need to know how to access this file and I do this by looking in my browser’s address bar. In the image below I’ve highlighted in red the important part that I need. This is the first part of the URL I need to use in Power Query.

%2F is the URL-encoded form of a forward slash, so the end of the URL can also be read as /Documents/Sales

I have everything I need now to access the file in Power Query. The full URL to the file is

All I need to do is change this line

to this
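One common pattern for this change looks like the following sketch (with a made-up local path and a made-up OneDrive for Business URL):

Source = Excel.Workbook(File.Contents("C:\Data\Sales\Sales.xlsx"), null, true)

becomes

Source = Excel.Workbook(Web.Contents("https://yourcompany-my.sharepoint.com/personal/your_name_yourcompany_com/Documents/Sales/Sales.xlsx"), null, true)

That is, the local File.Contents call is swapped for a Web.Contents call pointing at the OneDrive URL built above.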

You may be prompted to enter credentials to let Power Query know how to connect to OneDrive.

The process for changing the location of a text/CSV file is similar.

Loading Files from a Folder

I’m loading several Excel workbooks from my PC with this query

After the Source step I have this table – it’s 8 columns wide but I’ve chopped out the middle to make it fit the screen.

All transformation steps act on this table so after moving the files elsewhere, I need to create a new query that gives me the same table, so my transformation steps do not need to be altered.

I have just moved these files to one of our company Sharepoint sites. Our Sharepoint looks like this

I’ve moved the files into the MOTH Team Site, into a folder called World Domination. Perhaps the folder name gives away my secret scheme.

Enter the URL for the Sharepoint root

Sign in with your Microsoft or Organization account if required

You’ll then be presented with a list of all the files on your Sharepoint

Now inside the PQ editor, I only want the files in my World Domination folder

So filter the Folder path column to only include that folder

This gives me a table with the same structure as I had when loading these files from my PC.

Loading the files from Sharepoint and creating this table was done in 2 steps, loading the source and filtering the folder path.
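In M, those two steps might look roughly like this (a sketch; the site URL is illustrative):

Source = SharePoint.Files("https://yourcompany.sharepoint.com/sites/MOTH", [ApiVersion = 15]),
#"Filtered Rows" = Table.SelectRows(Source, each Text.Contains([Folder Path], "World Domination"))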

What I need to do now is take these steps from this query and insert them into the query that does all my transformations.

I need to make one other change which is to modify the #”Removed Other Columns” step. Originally it was referencing the Source step, but because I had to filter out some rows when loading from Sharepoint, I’ve now got a #”Filtered Rows” step after the Source step.
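In other words, a step that originally read something like

#"Removed Other Columns" = Table.SelectColumns(Source, {"Name", "Content"})

now needs to reference the new step instead:

#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows", {"Name", "Content"})

(The Table.SelectColumns call and the column names are just illustrative; the point is that the first argument changes from Source to #"Filtered Rows".)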

Nothing else needs changing so the query can be saved and my files should now load from Sharepoint.


The Art Of Query Building: Data Problems To Sql Queries

Introduction

Learning Objectives 

Understand how data flows through a SQL query and use this to solve data problems.

Transform data problems into SQL queries using a keyword-based approach.

Dos and Don’ts when it comes to SQL keywords.

Finally, we’ll go through an example of using the underlying approach.

This article was published as a part of the Data Science Blogathon.

TABLE: Where Is My Data?

First, I like to start by considering all the tables I need in the query. You can do this by listing all the fields that will be needed to get the desired result and then finding the tables that contain them. An important thing to note is that multiple tables may have the same field. For example, user data can be present in multiple tables at different levels of aggregation, so knowing what grain you want in the results is essential. When building the query, pick one table, go through the steps below, and then come back and repeat them for the next table. Also, if any array fields are needed from the table, now is a good time to unpack them.

FROM table_name LEFT JOIN UNNEST(table_array) AS array

WHERE: What I Don’t Want?

Now that you know where your data is coming from, it’s time to know what information you need and, more importantly, what you don’t need from the table. So if the table has a partition or if the query demands filtering a certain type of record, now is the time to use it. Also, I need you to look at all fields in a table and think about all possible ways to filter your data here. You should really push yourself to add more filters.

To put it simply, the less data your query sees, the better it performs and the fewer mistakes it makes. Further, we often skip obvious filters because they seem too trivial; for example, even if you’ve filtered on the partition date, the table might still contain multiple dates, so look for other date fields and add filters on them too.

WHERE partition_field = "date_value" AND col1 = "xyz" AND col2 IS NOT NULL ...

GROUP BY: What’s the Grain?

Before you SELECT anything, I’d recommend writing the GROUP BY. Having this first constrains what you select in your query: you can no longer do a `SELECT *`, which rarely makes sense anyway. It also weeds out duplicate records early, and trust me, you don’t want duplicates flowing through your query, as it’s difficult to determine their origin later. Finally, it forces you to perform aggregations.

You often don’t need a field itself but only its aggregated value. Getting this out of the way early is helpful so that the rest of the query sees less data. So I’d recommend having a GROUP BY for every table in your query; even if it’s not explicitly needed, it’s an excellent way to avoid duplicates and pull in only the data the query needs.

SELECT col1, col2 FROM table_name GROUP BY col1, col2

SELECT: What Do I Actually Want?

After doing all the work above, you can now think about what fields you’ll actually pull from the specific table. If you have followed the above steps, the scope of the fields has already been reduced to the fields that are needed for the specific results.

A `SELECT *` slows down your query and may lead to incorrect results, as you may end up with extra records. The only time you should do it is when you’re previewing all the fields in a table. Conversely, it’s perfectly feasible to select fewer fields first and add more later when needed.

CASE: Conditions

A case statement is SQL’s way of making IF-ELSE statements. These enable you to capture complex logic and show SQL’s real ability. In addition to using CASE statements for traditional applications, you should also use them to alter fields before selection. For example, if you’re not concerned about a field’s specific value but only want a discrete value like Y/N, this is the time to convert the field using CASE statements.

One thing to note here is always to have an ELSE condition that tells SQL what to do if none of your conditions are met. We’re often confident that we’ve covered all the scenarios in our CASE statement, but data always surprises us. Hence it’s better to have an ELSE condition to avoid unknown behavior.  Personally, I like to add `ELSE NULL` so I can see that data didn’t fall into any of my expected scenarios.

CASE WHEN col = "value" THEN "Y" ELSE "N" END AS new_col

Aggregations (Level 1): The Math

In this article, we’ll be talking about aggregations twice. At first, you should only worry about aggregations at a single table level. These are usually math-based, like sum, average, max, and min, or count-based. One thing to note for counts is that in 99% of the cases, you’d want to do a `COUNT(DISTINCT field_name)` instead of a regular `COUNT(field_name)` as the latter gives you a record count with duplicates in the specific field. A useful strategy is combining aggregations and CASE statements to capture complex logic in an easy manner. For example, building a purchase_indicator using the total transaction amount as below.
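A sketch of that idea, with hypothetical table and column names:

SELECT
    user_id,
    CASE WHEN SUM(amount) > 0 THEN "Y" ELSE "N" END AS purchase_indicator
FROM transactions
GROUP BY user_id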

ALIAS: Nicknames

This may seem trivial, but this step is important for readability and for writing correct queries, because you’ll often be deep in your query looking for a field and not know what it is called. Hence it’s essential to make these meaningful. Also, beyond aliasing aggregated or derived fields, it’s helpful to use aliases to rename fields with long or funky names in the table. That way, even though you cannot change the actual table, you can still call the field something easy to work with in your own query.

Now if the query you’re building only uses a single table, this is where you stop. However, in most cases, there’ll be more than one table, so you can read further.

CTE: Building Blocks

CTEs, or Common Table Expressions, let you build a temporary table inside your query without creating a real table. They are most useful for compartmentalizing your SQL query, which helps you think clearly because every element becomes a separate table that can then be combined.

At this point, you should put together all the above steps and wrap it in a CTE as done below. These also help in making changes to the query; for example, if you’re trying to edit certain conditions on a table, you can directly go to the associated CTE and make the change, enabling your change to cascade to the rest of your query.

WITH table_cte AS (
    SELECT col1, array.col2 AS col2_alias
    FROM table_name
    LEFT JOIN UNNEST(table_array) AS array
    WHERE col4 = "value"
    GROUP BY col1, array.col2
)

Now go back to TABLEs and repeat the steps above for any other tables in your query.

JOINs: Watch Out

SELECT col1, col2
FROM cte1 AS c1
JOIN cte2 AS c2
    ON c1.col1 = c2.col1
GROUP BY col1, col2

Aggregations (Level 2): More Math

Now is the time to combine the metrics in the final result by aggregating the JOIN results. Because these will make our final results, it’s useful to throw in things like final aliases and FORMAT that make sure the numbers are readable with the appropriate commas.

SELECT FORMAT("%'d", SUM(amount)) AS total_amount

ORDER BY: Make it Pretty

Ordering the results should always be saved for last, as this can’t go in any CTE or subquery. The only time it can be skipped is when your query is a production query whose results are consumed by a process rather than read by a person. Otherwise, adding an `ORDER BY` is helpful, even if not explicitly required, as it makes the results much easier to read. You can also use CASE statements here, alongside plain fields, to allow custom ordering of results, as sketched below.
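For instance, a CASE statement in the ORDER BY can push a catch-all bucket to the bottom while everything else is sorted by a metric (the field names here are hypothetical):

ORDER BY
    CASE WHEN state = "Unknown" THEN 2 ELSE 1 END,
    user_count DESC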

LIMIT: Make it Digestible

Finally, if the plan is to export the results or use them to drive another calculation, you can skip this. In other cases, though, having a LIMIT clause is a must: it returns only a certain number of records, making life easier for you and your SQL engine. If you forget it and your query is about to return a million rows, the query can let you down even without throwing any errors.

LIMIT 100

Putting It All Together

So let’s put our newly gained skills to use with an example. If you need more examples of queries with data and stories, head to my blog here.

The problem: We have an e-commerce store, and the marketing team wants a report of users who’ve not made a purchase in the last month. The report should be broken down by the state the user is in and the last interaction they had on the website.

WITH user_demographics AS (
    SELECT user_id, address.state AS state
    FROM demographics
    LEFT JOIN UNNEST(address) AS address
    WHERE country = "USA"
    GROUP BY user_id, address.state
),

user_purchases AS (
    SELECT user_id,
        -- assumed: the purchase indicator built as in the Aggregations (Level 1) section
        CASE WHEN SUM(amount) > 0 THEN "Y" ELSE "N" END AS agg_purchase
    FROM transactions
    GROUP BY user_id
),

user_events AS (  -- CTE name assumed; keeps each user's latest occurrence of every event
    SELECT * EXCEPT(rnk)
    FROM (
        SELECT user_id, event,
            RANK() OVER(PARTITION BY user_id, event ORDER BY date DESC) AS rnk
        FROM events  -- assumed name of the website interactions table
    ) t
    WHERE t.rnk = 1
),

user_no_purchases AS (
    SELECT a.*
    FROM user_demographics a
    LEFT JOIN user_purchases b
        ON a.user_id = b.user_id
    WHERE (b.user_id IS NULL OR agg_purchase = "N")
),

user_no_purchase_events AS (
    SELECT user_id, state, event
    FROM user_no_purchases  -- assumed join between the non-purchasing users and their latest events
    JOIN user_events USING(user_id)
    GROUP BY user_id, state, event
)

SELECT state, event, COUNT(DISTINCT user_id) AS user_count
FROM user_no_purchase_events
GROUP BY state, event
ORDER BY state, event
LIMIT 100

Conclusion

Here’s what we learned today:

We started by visiting the importance of SQL and building queries to solve business problems.

Then we delved into a step-by-step approach that leverages SQL keywords to transform data problems into queries.

In this, we highlighted common mistakes that go along with SQL keywords, for example, not having an `ELSE NULL` in a CASE statement.

We also reviewed best practices when writing SQL queries, including `GROUP BY`, to prevent duplicates.

Finally, we discussed an approach to query building using CTEs to compartmentalize your query.

Following these steps, you can transform any business problem into a SQL query that yields desired results.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.


The Best Phone Number Tracker Apps With Location Data

There’s something truly reassuring about anonymity no matter what platform you are on, as it affords you a certain kind of freedom. But when it comes to receiving anonymous phone calls on your mobile, anxiety and curiosity are the only things that come attached to it.

There are plenty of reasons why you’d want to find out the location of a specific number you’re receiving messages or phone calls from, and since it’s even possible to create virtual phone numbers today, phone number tracker apps have become essential.


While accurate location tracking is limited to network carriers and federal agencies, you don’t have to be James Bond to find the general location of a phone number. To begin with, all international calls come with their specific Country Code, so you can easily find out the region of the caller.

Additionally, the Google Play Store (APK) is packed with a bunch of apps that not only help you trace phone numbers but also provide location data to triangulate a general location of the caller easily.

How to find the location of origin of a phone number

Well, here are some great Android apps to help you with this.

The reason TruthFinder tops the list is not just the glowing reviews it has, but the fact that it is one of the most comprehensive background check apps you’ll find out there. You start off by signing up for the service, which gives you the ability to search for someone as a Person, through a Phone number, look up someone’s Email, and even get information on people based on Location.

Even with the free version of the TruthFinder app, you can find a surprising amount of detail about a person, including their given name, related contacts and possible relatives, location, and email address; it even goes as far as listing possible registered sex offenders in your neighborhood, so you’re safe at all times. With the premium subscription unlocked, you can find out social profiles, assets, educational qualifications and so much more from just a person’s name!

Probably one of the most popular phone number tracking apps you’ll ever come across on the Google Play Store, Truecaller has been helping users dodge scammers and spam callers for years and discover unknown callers the easy way. This is essentially a Caller ID app on steroids, which creates a database of users and helps connect the dots across the entire user base to find the name of the caller, their regional location, and more.

You can use Truecaller as your default Phone and SMS app as well, and it will automatically block and reject unsolicited calls and messages. Using the Truecaller Search feature, you can look up a number, find the name associated with it, get the caller’s general location, such as the country and state they are in, along with associated details such as their email address, and immediately use the Block feature right then and there.


While the two predecessors on this list offer a wide variety of features apart from caller and location tracking, Reverse Lookup sticks to the basics. The simplified user interface of the app makes it more approachable, while you do get the ability to look up a number directly from your recent calls or use the Manual Entry option to enter the number. Simply copy and paste the number into the search box and hit the Search button to look up the name of the caller, along with their regional location.

From here on, you can choose to Dial the number, Save Contact, Send Future Calls To Voicemail, and even use the Expanded Search Options to further investigate the phone number. There is even a Discussions tab where you can drop in questions or your experience with the caller, and you’ll usually find responses from other users when discussing a common spammer.


Since caller ID and blocking spam callers are part of the same communication category, most of the call tracking apps you’ll find tend to take over the call logs and messages as well. Showcaller is one such app that gives you a better way to find out who’s calling you from a phone number you may not have saved in your contacts, and it notifies you about a spam call right away so you don’t even have to bother picking it up.

This caller ID app comes with a separate Search section to help you find details of a contact based on their name or the phone number from their massive directory of over 4 billion contacts. Directly using the contact screen, you can choose to Save the contact, Block and Report it, and even find the location of the number. Showcaller even offers you an entire Comments section to help other users stay away from potential spammers.

Similar to the Truecaller app that maintains a directory of users to help find numbers and root out spam callers, Whoscall also offers a built-in SMS and Phone app that can take over the stock apps to help you manage unknown callers more efficiently. After comparing it with the others, we did notice that Whoscall was not that great at discovering the identity of personal contacts, but was more than efficient at finding work/commercial contact numbers. This call tracking service can not only help you easily find the identity of the caller, but pinpoint the location, offer opening/closing hours for public stores, and even find community-based reports for the listing.

Exploring Data With Power View Multiples


Multiples, also called Trellis Charts, are a series of charts with identical X and Y axes. You can arrange Multiples side by side to compare many different values easily at the same time.

You can have Line charts, Pie charts, Bar charts and Column charts as Multiples.

You can arrange the Multiples horizontally or vertically.

Line Charts as Multiples

You might want to display the medal count by year for each Region. Firstly, you need to have the field Year. To get this field, you need to have a calculated column as follows −

Type =YEAR ([Edition]) in the formula bar and press Enter.

A new column with the header CalculatedColumn1 is created, with values corresponding to the Year values in the Edition column. Rename this column to Year.

Close the PowerPivot window. The Data Model gets updated. The new field – ∑ Year appears in the Power View Fields list.

Create a Table in Power View with fields NOC_CountryRegion, Count of Year and Medal Count, by dragging the fields.

Convert Table into a Line chart in Power View.

Remove the field NOC_CountryRegion. A Line chart appears with Medal Count by Year.

As you can observe, Year is in AXIS area and Medal Count is in ∑ VALUES area in Power View Fields list. In the Line chart, Year values are on X-axis and Medal count on Y-axis.

Now, you can create Multiples visualization with Line charts, as follows −

Drag the field NOC_CountryRegion to VERTICAL MULTIPLES area in the Power View Fields list.

You will get the Multiples Visualization with Line charts arranged as a grid, with each Line chart representing a country (NOC_CountryRegion).

Vertical Multiples

As you are aware, you have placed the NOC_CountryRegion field in the VERTICAL MULTIPLES area. Hence, the visualization that you have got is the Vertical Multiples visualization. You can observe the following in the chart given above.

One Line chart per category that is placed in VERTICAL MULTIPLES area, in this case – the country.

The grid height and grid width that you have chosen determine the number of rows and number of columns for the Multiples.

A common x-axis for all the multiples.

A similar y-axis for each row of the multiples.

A vertical scroll bar on the right side that can be used to drag the rows of Line charts up and down, so as to make the other Line charts visible.

Horizontal Multiples

You can have the Multiples Visualization with Horizontal Multiples also as follows −

Drag the field NOC_CountryRegion from the VERTICAL MULTIPLES area to the HORIZONTAL MULTIPLES area.

Select the values for Grid Height and Grid Width in the Multiples group.

You will get the Horizontal Multiples visualization as follows −

You can observe the following in the above chart −

One Line chart per category that is placed in HORIZONTAL MULTIPLES area, in this case – the country.

The grid height that you have chosen determines the height of the Line charts, unlike the number of rows of Line charts as is the case in the VERTICAL MULTIPLES. In other words, there is a single row of Line charts with the height determined by the Grid Height that is chosen.

The grid width that you have chosen determines the number of columns of Line charts in the row.

A common x-axis for all the multiples.

A common y-axis for all the multiples.

A horizontal scroll bar at the bottom, below the x-axis, that can be used to drag the row of Line charts to the left and the right, so as to make the other Line charts visible.

Pie Charts as Multiples

If you want to explore / visualize more than one category in Multiples, Pie charts are an option. Suppose you want to explore the medal count by medal type for each of the countries. Proceed as follows −

Select Pie from the dropdown under Other Chart.

Drag Medal to the area SLICES.

You will get the Horizontal Multiples visualization with Pie charts, as you have the field NOC_CountryRegion in the area HORIZONTAL MULTIPLES.

As you can observe the medal-count for each country is displayed as a Pie chart with the slices representing the medal types with the color as given in the Legend.

Suppose you want to highlight the count of gold medals for all the countries. You can do it in a single step as follows −

As you can observe, this gives a fast way of exploring and comparing the count of gold medals across the countries.

You might want to display a larger number of Pie charts in a visualization. You can do so by simply switching over to the Vertical Multiples visualization and choosing suitable values for Grid Height and Grid Width for a proper display.

Bar Charts as Multiples

You can choose Bar charts also for Multiples visualization.

Switch over to Stacked Bar visualization.

Adjust the Grid Height and Grid Width to get a proper display of the Bar charts.

With Grid Height of 6 and Grid Width of 2, you will get the following −

You can have Clustered Bar charts also for this visualization.

Column Charts as Multiples

You can choose Column charts also for Multiples visualization.

Switch over to Stacked Column visualization.

Adjust the Grid Height and Grid Width to get a proper display of the Column charts.

With Grid Height of 2 and Grid Width of 6, you will get the following −

You can have Clustered Column charts also for this visualization.

Wrap-up

The fields you choose depend on what you want to explore, analyze and present. For example, in all the visualizations above, we have chosen Medal for Slices that helped to analyze medal count by medal type. You might want to explore, analyze and present the data gender-wise. In such a case, choose the field Gender for Slices.

Once again, the visualization that is suitable also depends on the data you are displaying. If you are not sure about the suitability, you can just play around to choose the right one as switching across the visualizations is quick and simple in Power View. Moreover, you can also do it in the presentation view, in order to answer any queries that can arise during a presentation.


Coverity: Scanning Open Source Code

The process of software development is one with multiple layers. At the base layer is the code which developers write, which is then compiled by the build system that puts the code together so it is ready for deployment. Code analysis vendor Coverity is now expanding its analysis beyond just the static code layer to include the sometimes overlooked build system.

The new type of analysis could potentially help to reduce software defects across a wide array of applications. Coverity’s new system will first be made available to its commercial clients but will also find its way to Coverity’s open source scanning effort that has helped to eliminate over 8,500 software defects from open source software.

“The build system is essentially the assembly line for code,” said Ben Chelf, CTO of Coverity. “It takes all the pieces that developers write and puts them together. By analyzing the build system you’re going to find different things than what you’d find just by analyzing the code itself.”

Chelf explained that the way the Build Analysis software works is by watching how the software is built, as opposed to parsing the actual build configuration files themselves.

“What we do is make the observation that every build system has to make calls into the operating system and execute processes, and all this information can be observed,” Chelf explained. “So we have over 80 different system calls to capture build information, and we just have a wrapper script that sits there and watches. From that, we can build up a complete dependency graph.”

One item that was found during beta testing of the Build Analysis solution was repetitive system calls in the build process. In one example, Coverity found that a certain process was unnecessarily being executed 10,000 times.

Open Source Scanning

Coverity has been scanning open source code for software defects since 2006. Originally, the Coverity Scan effort was backed by the Department of Homeland Security, but it is currently being run and financially supported by Coverity itself. The Coverity Scan effort looks at several hundred open source projects in an effort to help find and fix software defects.

Chelf noted that the plan is to add the Build Analyzer to the open source scan effort soon, though he did not provide specific timing.

“It’s on our roadmap for open source scanning,” Chelf said. “It’s just a matter of checking it off the list.”

Chelf argued that the Coverity Build Analysis system is unique in the code analysis marketplace. That claim aside, Coverity competitor Klocwork claims that they too can now do build system analysis of a sort.

“Currently most of our build analysis technology is used to provide automated discovery of a customer’s build system in order to run effective, accurate code analysis,” said Brendan Harrison, Klocwork’s director of marketing. “This is a must-have capability for deep static code analysis. In addition we’ve had numerous customers in the past use our analysis capabilities to optimize their build times through structural clean-up of their code.”

Protecting Against Open Source Vulnerabilities

The Coverity Build Analysis system also enables developers to ensure that they are not unintentionally including vulnerable open source code in their builds, by way of integration with code licensing analysis vendor Palamida.

Chelf explained that in partnership with Palamida’s software, a developer can examine the entire build process to identify if any vulnerable open source code is being used. Palamida maintains a database of up to date open source libraries and applications and can identify if an older, potentially vulnerable version of a given piece of open source code is being used.

The new code analysis from Coverity is complemented by the new Coverity Integrity Center product, which aims to tie in all the various pieces of code analysis to provide developers with a full view of what’s going on. In addition to Coverity’s Prevent code analysis, which performs static code analysis and the new Build Analysis, Integrity Center also pulls in the Architecture Analyzer, which was rolled out earlier this year.

“There are different ways to analyze software systems, from an architecture perspective from a build perspective and from a code perspective,” Chelf said. “You’ve got to analyze in as many ways as possible. All of these different perspectives enable us to find defects in different and interesting ways.”

Growing Inequality In Supercomputing Power

Supercomputing power is being concentrated in a smaller number of machines, according to the latest Top500 list of high-performance computers. Keepers of the list are uncertain how to parse that trend.

The first 17 entrants in the latest supercomputer ranking produce half of all the supercomputing power on the list, which totaled over 250 petaflop/s (quadrillions of calculations per second), noted Erich Strohmaier, an organizer of the Top500 twice-yearly ranking of the world’s most powerful supercomputers, speaking at a Tuesday evening panel at the SC2013 supercomputer conference.

The first-place entrant alone, the Chinese Tianhe-2 system, brought in 33.86 petaflop/s.

“The list has become very top heavy in the last couple of years,” Strohmaier said. “In the last five years, we have seen a drastic concentration of performance capabilities in large centers.”

Confusing trend

The organizers of the Top500, however, are unsure if the trend bodes ill for supercomputing in general. Could it signal a decline in supercomputing overall, or a concentration of supercomputing’s investigative powers among fewer government agencies and large companies?

“We don’t know what it actually means,” said Horst Simon of Lawrence Berkeley National Laboratory, one of the organizers of the Top 500. “But it is important to exhibit the trend and have a discussion.”

To characterize the depth of this “anomaly,” as Strohmaier called the trend, he used a measure of statistical dispersion called the Gini coefficient, which quantifies how unevenly a resource is distributed. The Gini coefficient, often used to measure the wealth distribution of nations, can range from 0, where the resource is spread evenly among all the holders, to 1, where one party holds all of the resources.
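For reference, one textbook formulation of the Gini coefficient for n holders with resource shares $x_1, \dots, x_n$ (this formula is not quoted from the talk itself) is

$$ G = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} |x_i - x_j|}{2n\sum_{i=1}^{n} x_i} $$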

The list scored a Gini coefficient of 0.6, which is quite high, Simon noted. By way of comparison, were the Top500 supercomputers a nation, it would have greater inequality in computation than all but a few countries have today in terms of wealth distribution. Simon jokingly called it “the rich-getting-richer phenomenon of supercomputing.”

Drilling further down into the metrics, Strohmaier found no major differences between the buying habits of governments and industry. Both parties are buying fewer midsized systems and concentrating their efforts on building fewer, larger systems.

The trend could be problematic because fewer larger systems might reduce over time the number of administrators and engineers skilled in running high-performance computers. On the other hand, it might not be problematic in that most of the largest systems are shared across multiple users, such as all the researchers from a nation’s universities.

One member of the audience for the panel, Alfred Uhlherr, who is a research scientist for Australia’s Commonwealth Scientific and Industrial Research Organization (CSIRO), attributed the cause to another possible factor.

Non-participants

A number of organizations he knows of, both governmental and industrial, decline to participate in the Top500, knowing that their systems would not rank that high on the list. Nations such as China, or companies such as IBM, can generate positive publicity for themselves by being positioned near the top of the list. For entrants that would appear in the bottom reaches of the list, the benefits of getting on it may not be worth the effort.

Not helping in this regard is the sometimes laborious Linpack benchmark that supercomputers are required to run to be considered for the voluntary Top500.

For instance, the U.S. Department of Energy Lawrence Livermore National Laboratory’s Sequoia machine, which ranked third on the current list with 17 petaflop/s, had to run Linpack for over 23 hours to get its results, noted Jack Dongarra, another one of the list’s curators and a co-creator of Linpack.

Jack Dongarra (photo: Joab Jackson)

That night, Dongarra suggested that Linpack, created in the 1970s, is no longer the best metric to use to estimate supercomputer performance. He championed the use of a new metric he also helped to create, called the High Performance Conjugate Gradient (HPCG).

“In the 1990s, Linpack performance was more correlated with the kinds of applications that were run on high-performance systems. But over time that correlation has changed. There is a mismatch now between what the benchmark is reporting and what we are seeing from applications,” Dongarra said.

Nonetheless, many in attendance at the conference still find the Linpack-driven Top500 viable. CSIRO’s Uhlherr said his organization still studies the list closely, not so much for the Linpack ratings, but to observe which industries, such as energy companies, are using supercomputers, as a way of assuring Australia is staying competitive in these fields.
